Features and Benefits
Bad Behavior is designed to integrate into your PHP-based Web site, running as early as possible to throw out spam bots before they have the opportunity to vandalize your site with their junk, or even to scrape your pages for e-mail addresses and forms to fill out.
Not only does Bad Behavior block actual vandalism to your site, it also blocks many e-mail address harvesters, resulting in less e-mail spam, and many automated Web site cracking tools, helping to improve your Web site’s security.
Bad Behavior runs before your software on each request to your Web site, so if a spam bot does visit, it will receive nothing, and your software never runs. This reduces the amount of server CPU time, database activity and bandwidth spent on processing robots which are just harvesting your site and delivering junk.
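To make the "runs first, your software never runs" idea concrete, here is a minimal sketch in Python using a WSGI wrapper. It is an analogy only: Bad Behavior itself is PHP, and the names `screening_middleware`, `is_bad`, `site` and `no_user_agent`, as well as the 403 status, are assumptions invented for this illustration.

```python
def screening_middleware(app, is_bad):
    """Wrap a WSGI app so that is_bad(environ) is checked before app runs."""
    def wrapped(environ, start_response):
        if is_bad(environ):
            # Reject immediately; the wrapped application is never called.
            start_response("403 Forbidden", [("Content-Type", "text/plain")])
            return [b"Request blocked.\n"]
        return app(environ, start_response)
    return wrapped


calls = []  # records every request that actually reaches the site


def site(environ, start_response):
    """Stand-in for your real software (hypothetical)."""
    calls.append(environ["PATH_INFO"])
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"Hello, human.\n"]


def no_user_agent(environ):
    """Hypothetical check: treat a missing User-Agent as suspicious."""
    return not environ.get("HTTP_USER_AGENT")


protected = screening_middleware(site, no_user_agent)
```

A rejected request costs one cheap check and a short response; the site code, and any database work behind it, is skipped entirely.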
Bad Behavior rejects spam bots outright, sending an appropriate 4xx error code. This lets you filter them out of your server’s logs when you do log analysis, making the logs cleaner and more accurate and giving you better insight into the human beings visiting your site, rather than the spammers.
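As a rough illustration of the log filtering described above, the following Python sketch drops every line with a 4xx status from an access log. It assumes the common Apache-style log layout, where the status code follows the quoted request line; the sample lines, addresses and the specific 403 status are made up for the example.

```python
import re

# Matches the quoted request line followed by a three-digit status code,
# e.g.  "GET / HTTP/1.1" 403 512
STATUS_RE = re.compile(r'"[^"]*" (\d{3}) ')


def is_client_error(line):
    """Return True if the log line records a 4xx response."""
    match = STATUS_RE.search(line)
    return bool(match) and match.group(1).startswith("4")


def drop_rejected(lines):
    """Keep only log lines that were not answered with a 4xx status."""
    return [line for line in lines if not is_client_error(line)]


sample = [
    '203.0.113.5 - - [10/Oct/2023:13:55:36 +0000] "GET / HTTP/1.1" 200 2326',
    '198.51.100.7 - - [10/Oct/2023:13:55:40 +0000] "POST /comment HTTP/1.1" 403 512',
]
clean = drop_rejected(sample)
```

Note that this also drops ordinary 404s from real visitors; if you care about those, filter on the specific status codes you see for blocked requests instead.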
Bad Behavior is fully compatible with reverse proxies, HTTP accelerators, load balancers and content distribution networks. It is fully Section 508/WAI compliant. And it stores personally identifying information for a maximum of seven days (usually it is not stored at all), making it compatible with virtually any corporate or government privacy requirements.
Bad Behavior is designed as a platform-independent package which uses a connector to integrate with a given software package (MediaWiki, WordPress, etc.). This lets Bad Behavior run on a very wide variety of Web applications, including custom scripts you may have written yourself. With some Web servers, Bad Behavior can even be used to protect static HTML pages.
How it Works
It’s black magic.
Bad Behavior manages to block nearly all link spam without ever looking at the spam. While analyzing received spam might be useful, Bad Behavior does not do so, for performance reasons. I’ve also found that that way lies madness: spammers are constantly buying new domain names, so a filter that looks at the content of the spam can still miss a lot of it.
Instead, Bad Behavior pioneered an HTTP fingerprinting approach: rather than looking at the spam, we look at the spammer. Bad Behavior analyzes the HTTP headers, IP address, and other metadata regarding the request to determine if it is spammy or malicious. This approach has proved, as one user said, “shockingly effective.” After all, spammers write their bots on the cheap, and have little incentive to code very well. If they could code very well, they probably wouldn’t be spammers.
When Bad Behavior looks at a request, it determines whether the request matches a profile of known malicious or spammy activity and falls outside the bounds of a normal human browsing the Web. If so, the request is blocked. A way out is provided, however, for any human beings who are blocked because of an unusual configuration or a virus or Trojan on their computer.
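For a feel of what screening by request metadata can look like, here is a small Python sketch. The two rules in it are hypothetical examples chosen for illustration; they are not Bad Behavior's actual rule set, and the function name `screen_request` is invented here.

```python
def screen_request(headers):
    """Return (allowed, reason) for a dict of HTTP request headers.

    The rules below are hypothetical examples, not Bad Behavior's rules.
    """
    # Header names are case-insensitive in HTTP, so normalise them first.
    h = {name.lower(): value for name, value in headers.items()}

    # Hypothetical rule 1: a request with no User-Agent at all is suspicious.
    user_agent = h.get("user-agent", "").strip()
    if not user_agent:
        return False, "missing User-Agent"

    # Hypothetical rule 2: something claiming to be a mainstream browser
    # but sending no Accept header does not look like a real browser.
    if "mozilla" in user_agent.lower() and "accept" not in h:
        return False, "browser User-Agent without Accept header"

    return True, "ok"


browser = {"User-Agent": "Mozilla/5.0 (X11; Linux x86_64)", "Accept": "text/html"}
sloppy_bot = {"User-Agent": "Mozilla/5.0"}
```

With these sample inputs, `screen_request(browser)` allows the request and `screen_request(sloppy_bot)` rejects it, without either call ever reading the body of what was posted.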
From the start, Bad Behavior has had two overriding design requirements. The first is that it must be fast. Users will get annoyed by waiting around for their traffic to be screened for spamminess. (Is that a word?) Especially since Bad Behavior screens all requests in order to block email harvesters and certain malicious robots, speed is paramount. I’ve had to abandon good ideas because they would add significantly to Bad Behavior’s run time, which is typically measured in milliseconds, and can be cut to hundreds of microseconds for very high traffic sites.
The second requirement is that it must block as few legitimate users as possible, and when one is blocked, they must be able to unblock themselves through an action simple and fast enough that they can simply hit the browser’s reload button once they’ve completed the action. Bad Behavior provides a technical support key to each blocked request which allows the requester, if it’s a legitimate human being, to get immediate, self-service support to fix the problem (e.g. virus removal, change of browser preference, etc.) and go back to browsing. Out of countless millions of requests served daily, an average of 50 people use the technical support system, and virtually all of those resolve the problem themselves in under five minutes.
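The self-service idea can be sketched as follows, again in Python. This is only one way such a scheme could work: the key format, the rule identifiers, the advice texts and the `SupportDesk` class are all assumptions made up for this example, since the text above does not describe how the real support keys are built.

```python
import secrets

# Hypothetical remediation advice, keyed by an invented rule identifier.
ADVICE = {
    "no-user-agent": "Your browser is not sending a User-Agent header; "
                     "check your privacy add-ons.",
    "no-accept": "Your browser or proxy is stripping the Accept header.",
}


class SupportDesk:
    """Hands out a key per blocked request and maps it back to advice."""

    def __init__(self):
        self._blocked = {}  # support key -> rule identifier

    def issue_key(self, rule_id):
        """Create a short random key to show on the block page."""
        key = secrets.token_hex(4)  # 8 hexadecimal characters
        self._blocked[key] = rule_id
        return key

    def lookup(self, key):
        """Return the advice for a key the visitor typed in."""
        rule_id = self._blocked.get(key)
        if rule_id is None:
            return "Unknown support key."
        return ADVICE.get(rule_id, "No advice is recorded for this rule.")


desk = SupportDesk()
key = desk.issue_key("no-accept")
advice = desk.lookup(key)
```

The point of the design is that the blocked visitor needs nothing but the key shown on the error page: entering it returns the specific fix, after which a reload should go through.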
Spam Prevention Strategy
Despite the best efforts of the brightest minds on the Internet, spam isn’t going away anytime soon. (We just haven’t figured out how to deliver electric shock over the Internet yet.) And to be most effective at blocking it, you may need to apply a variety of techniques.
Bad Behavior is completely different from any other anti-spam solution out there, in that it doesn’t specifically target spam itself. Rather, it targets the methods by which the spam is delivered. Until I released the first version in 2005, this approach had never been tried. It proved very effective at stopping a lot of malicious activity, not just spam: It also blocks many email address harvesters, meaning less e-mail spam, and some types of automated cracking attempts, improving your server’s security.
While a somewhat similar solution called mod_security exists, it has a rather different purpose and doesn’t target spam, and regular people can’t install it on their shared web hosting accounts. Bad Behavior blocks spam as well as other malicious activity and can be installed by anyone.
On some high-traffic sites, or those specifically targeted by spammers, the traffic from these spam attacks can be so excessive as to exceed your account’s bandwidth limits, or overload the server, and cause your account to be suspended. Bad Behavior helps to prevent both of these situations by blocking malicious activity as soon as possible, before either bandwidth or CPU time is expended on a request which will turn out to be bogus.
But because Bad Behavior intends to block no legitimate users whatsoever, it must necessarily let some things pass. Consider it your first line of defense, and back it up with a secondary line of defense in the form of a more traditional anti-spam tool for your platform. For WordPress, this can include Akismet or Spam Karma 2.
You absolutely should use both. If you use only the secondary line of defense, your administrative screen will rapidly fill with so much spam that you won’t be able to find and recover the occasional legitimate comment that those tools block. Because Bad Behavior blocks most spam before you ever see it, the amount of garbage you have to sift through to find legitimate comments, or the number of edits you have to revert on your wiki, is greatly reduced.
In this way Bad Behavior saves you time and frustration and gives you peace of mind by turning spam from a colossal nightmare into, well, not much at all.