
Bot Taxonomy


How do different kinds of automation on the internet coexist?

Automation Bots

I’ve been fascinated with bots and automation ever since I started programming. I got into it by looking into Discord bots and quickly learned about scraping and other kinds of automation to add new functionality to the tools I was working on at the time. I eventually ended up building scalpers to pay for my college tuition, which was an interesting experience but not something I enjoyed doing (like college itself). Today, I still work in a field that involves botting, just on the other side of the table.

There are many different kinds of bots whose tasks range from crawling the internet to make websites accessible for everyone to making businesses lose as much money as possible. The story of bots on the internet is one made up of protagonists, antagonists, and countless supporting characters; villains who try to pillage metaphorical cities for gold and reckless heroes who put everyone else at risk in order to save the day.

So, how common are bots on the internet? Well, it’s hard to tell, but it seems that:

62% of all web traffic comes from bots

I’ll admit, that’s a slightly dubious number. Figures from other sources are way off from this one. Still, seeing as there are so many bots out there, I figured I’d categorize them and the rich ecosystem they exist in to see what they’re all about.

I will be using an extremely scientifically rigorous model created by the brightest philosophers and botanists of our generation to explain where I place different aspects of the automation ecosystem in (my) moral system.

Google & other search engines pretending to be relevant

  1. Googlebot
  2. DDG Crawler
  3. Bingbot

Bot protection as a service

  1. Cloudflare
  2. Datadome
  3. Imperva
  4. Akamai

Port scanning as a service

  1. Shodan.io
  2. The security researchers who keep pinging my server like a million times a week with masscan

Home automation

  1. IoT (Internet of Things)
  2. Home Assistant

Independently run bots for small-scale automation

  1. The 50-line Python script your friend wrote to make an anime command for his Discord bot

Proxy and abuse tooling vendors

  1. Oxylabs
  2. Luminati
  3. 2Captcha

Botting as a service

  1. GPU Scalpers
  2. Sneakerbots
  3. Votebots
  4. Automated account cracking

Purely malicious hacking and/or service denial

  1. DDoS
  2. Destructionware

😇 Lawful Good

The good guys.

Any automation that is beneficial to you and conforms to how you expect it to behave with pure intentions. These are generally your standard search engines. They’re the reason why the internet works in the first place. Aside from a few ridiculous crawlers like Baidubot, they will go out of their way to ensure your site isn’t affected by what they’re doing.

They’re also the reason why it can be challenging to protect yourself against unwanted automation. A solution that works well for blocking the bad bots may need to be reconsidered if it also blocks the “good bots”, since that would ultimately be a net negative for your site.
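To make the “good bots” point a bit more concrete, here is a minimal sketch of how a well-behaved crawler decides what it may fetch, using Python’s standard `urllib.robotparser`. The robots.txt contents, the `example.com` URLs, and the `SomeOtherBot` name are all made up for illustration. Keep in mind that robots.txt only works on bots that choose to honor it, so it does nothing against the categories further down this list.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt for an imaginary site; the paths are invented.
ROBOTS_TXT = """\
User-agent: Googlebot
Disallow: /admin/

User-agent: *
Disallow: /
"""


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text without making any network request."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)

# Googlebot may crawl the blog but not the admin area.
googlebot_blog = parser.can_fetch("Googlebot", "https://example.com/blog/post")
googlebot_admin = parser.can_fetch("Googlebot", "https://example.com/admin/users")

# Every other crawler falls under the "*" rule and is asked to stay out.
other_blog = parser.can_fetch("SomeOtherBot", "https://example.com/blog/post")
```

A polite crawler runs a check like `can_fetch` before every request. A hostile one simply skips it, which is why the rest of this taxonomy exists.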

😇 Neutral Good

The blue team.

These companies/services will ensure that Neutral Evil does not manage to automate your website and cause harm. They fight for good, but the methods that they use to provide their services result in privacy violations for real users through the use of things like browser fingerprinting and aggressive data collection.

Protections like Google’s reCAPTCHA or Binance’s weird puzzle slider thingy also often make it difficult, and sometimes impossible, for users who rely on accessibility tools like screen readers to continue using the site.

Image: Binance’s slider puzzle for logins

Bots are very hard to distinguish from users who use the same technologies for legitimate purposes. If you want to make things harder for bots, you inevitably make things harder for accessibility tools, and any bypass you add to help these users will be abused by bots that pretend to be legitimate users requiring accessibility. This often puts fighters in this category in a very tricky lose-lose position where there’s just no good solution to the problem that doesn’t involve major compromises.

Neutral Good is in a constant battle with Neutral Evil as well as regular users who feel like they’ve become collateral damage in the name of web security and have come to expect it to work like magic without involving any drawbacks.

😇 Chaotic Good

The kinds of automation that are ultimately for the benefit of everyone but have to be exercised with caution.

One of the best examples of this is probably Shodan.io. They are constantly scanning your infrastructure and exposing the services running on it like databases, SSH access, FTP servers, and more. If you use it properly, it’s a tool that can be used to detect unwanted exposure, but it’s also a tool that allows bad actors to uncover information about your infrastructure that could’ve otherwise stayed hidden.

It’s not a good practice to rely on security through obscurity, but there’s definitely an argument to be made for how Shodan can be abused for the same reasons it can be a useful protection tool.
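The “detect unwanted exposure” half of that trade-off can be sketched in a few lines. This is a rough illustration using plain TCP connects from Python’s `socket` module, not how Shodan itself works. The function names are my own, and you should only ever point something like this at hosts you own or are allowed to test.

```python
import socket


def is_port_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)
        # connect_ex returns 0 on success and an error code otherwise.
        return sock.connect_ex((host, port)) == 0


def exposed_ports(host: str, ports: list[int], timeout: float = 1.0) -> list[int]:
    """Return the subset of `ports` that accept TCP connections on `host`."""
    return [port for port in ports if is_port_open(host, port, timeout)]
```

Calling something like `exposed_ports("127.0.0.1", [21, 22, 3306, 5432])` on your own machine gives a quick look at whether FTP, SSH, or a database is listening where you didn’t expect it. A real scanner does far more (banner grabbing, service detection, internet-wide coverage), but the basic question it asks is the same.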

🙄 Lawful Neutral

Overall useful utilities with no specific benevolent or malevolent purpose.

IoT devices in particular have no alignment other than doing what they need to be doing, which is generally positive. They don’t provide a service that can be considered inherently malicious or benevolent.

🙄 True Neutral

Mostly uninvolved 3rd parties.

These people aren’t really motivated in any particular way. They just want to go about their day without having to deal with a bunch of stuff. They’re usually people who are just trying to create something new for the sake of learning or for fun, and they often get caught in the crossfire between abuse and protection.

🙄 Chaotic Neutral

The services that provide ammunition for everyone else.

There’s a huge part of the bot ecosystem built around vendors that sell tooling for executing abuse at a bigger scale. The most common examples of these are services like Luminati (I guess they call themselves Bright Data now) or Oxylabs, which will sell you proxies that “lend” you their IP addresses so you can bypass anti-abuse measures like IP bans.
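To see why rented IP addresses are so effective, here is a toy sketch of a per-IP request limit, seen from the defender’s side. The `PerIpLimiter` class, the limit of 5, and the addresses (taken from the ranges reserved for documentation) are all assumptions made up for this example, and real anti-abuse systems look at far more than the source IP.

```python
from collections import Counter


class PerIpLimiter:
    """Toy limiter: each IP address gets a fixed budget of requests."""

    def __init__(self, max_requests: int) -> None:
        self.max_requests = max_requests
        self.seen: Counter[str] = Counter()

    def allow(self, ip: str) -> bool:
        """Count the request and report whether it is within budget."""
        self.seen[ip] += 1
        return self.seen[ip] <= self.max_requests


def count_allowed(limiter: PerIpLimiter, ips: list[str]) -> int:
    """Return how many requests from the given sequence of IPs get through."""
    return sum(limiter.allow(ip) for ip in ips)


# 100 requests from a single address.
single_ip = ["203.0.113.7"] * 100

# The same 100 requests spread over 20 addresses, 5 requests each.
rotating = [f"198.51.100.{i % 20}" for i in range(100)]

blocked_client = count_allowed(PerIpLimiter(5), single_ip)
proxied_client = count_allowed(PerIpLimiter(5), rotating)
```

With a budget of 5 per address, the single-IP client gets 5 requests through while the rotating one gets all 100. That asymmetry is what these vendors are selling, and it’s why IP-based limits on their own don’t hold up against anyone with a proxy budget.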

There are other ones like 2Captcha where you can pay a child in Bangladesh 3 cents to click on fire hydrants in captchas for you so you can deploy an army of automated bots to get past human checks. This is why captchas are borderline useless on sites targeted by botters who have the money to pay for these kinds of 3rd party captcha services.

Similar to someone selling guns on the black market to unknown buyers, Chaotic Neutral doesn’t technically carry out any of the abuse themselves; instead, they enable another party to do it in exchange for money. The morality here is questionable at best, and they’re well aware of what their target market is.

😈 Lawful Evil

I don’t actually know what fits into this category.

Any technology or service that is malicious enough to cause problems is motivated by money or something equally selfish. I can’t think of any sort of destructive automation in the wild that claims to be righteous to justify its actions. The first thing that comes to mind is something like Anonymous, but they don’t fit into the category of automation.

😈 Neutral Evil

The red team.

People using automation in this category do not have malicious intentions and do not set out to cause harm to the services they target. However, they believe that they have the right to obtain what they want without any regard for rules, and they do not care whether their actions will harm the other party involved.

This is the category of bots that cause the price of GPUs to skyrocket, get your Roblox account hacked, purchase the limited edition Supreme crowbar you wanted, and add millions of fake views to videos on YouTube.

The industry behind these bots has grown to such a scale that most people have started considering it a fact of life. Sneakerbots and scalpers specifically are sold as a service to users who want to run their own bots to buy limited edition stuff, and they come with a rich support system for their customers. They’re run by dedicated developers who charge a subscription fee and keep up with the latest changes to sites to make sure users of the bot can continue to bypass new protections.

For people at the top tiers of this category, it’s not a matter of whether you’ll be able to stop them, but for how long. When the person abusing your site makes a million dollars a year offering their services, there’s not a whole lot you can throw their way to tire them out.

Neutral Evil generally believes that they’re not doing anything wrong; instead, they think they’re simply hustling and that it’s up to everyone else to step up their game instead of being forced to stop.

Despite not being motivated by malice in particular, they’re ready to go to the same lengths as Chaotic Evil to get what they want.

😈 Chaotic Evil

The bad guys.

These are the actors whose actions center around bringing harm to the services they target. Whether it’s caused by revenge or something else, the automation in this field primarily serves to inflict damage on the victim without an obvious direct benefit to the actor. This mostly includes things like botnet DDoS or destructionware.

Surprisingly, there’s not a whole lot that falls into this category. Unlike anime supervillains, most people in real life are not motivated by pure evil; there’s almost always an underlying motivation that explains people’s actions beyond just being evil and wanting world domination.

I’m a huge fan of this little world we’ve built around automation and the endless cat and mouse game between the two parties. Hopefully, I’ll have the opportunity to work more in-depth in this field and learn more about different kinds of ways to keep raising the bar of competency needed from bot developers to abuse services.