Archive for the 'Drupal' Category
It’s that time again.
Time for me to take Bad Behavior, throw its core engine away and rewrite it from scratch.
For the second time.
Why?
As of now, Bad Behavior is shockingly effective, as one user said, at blocking automated spam and other malicious activity. However, it doesn’t catch all possible spam. There’s one important class of automated spam I would like to catch but cannot right now: spam that is delivered from hijacked Web browsers. This accounts for virtually all of the spam that Bad Behavior currently misses.
I believe I have a good strategy for catching this class of spam, but Bad Behavior’s current design won’t accommodate it.
In addition, there are a number of features which were pushed to post-2.2 because the current design won’t easily accommodate them either.
Thus, it’s time to redesign Bad Behavior.
I’ve already begun laying out Bad Behavior 3, and depending on the time available to me, I hope to have an alpha quality release by the end of the month.
To make that happen, though, I need your help right now.
Core Changes
Bad Behavior 3 will include an automatic update facility which you will be able to use to keep Bad Behavior up to date automatically, even if your host platform does not allow for automatic updates. For host platforms that have their own automatic update process, such as WordPress, you will be able to choose which process you want to use to keep Bad Behavior updated. Updates distributed with this new method will be protected via digital signature.
Bad Behavior 3 will support internationalization and localization, with translations available in as many languages as I can find translators for. Bad Behavior will use the PHP gettext extension, which is available on virtually all platforms including Windows, for core i18n/l10n. I will make a call for translators sometime in the next few days.
As of now, Bad Behavior 3 will require PHP 5.2 or later. If your server is still running some antique server software, now is a good time to update it.
Platform Connector Changes
Bad Behavior’s platform connectors will also be completely redesigned; the API for version 3 is completely different from version 2 and will not be backward compatible. The new design will enable features on various platforms which were difficult or impossible before, such as PostgreSQL support and a special page on MediaWiki, better Drupal 6 and 7 integration, a 100% functional generic platform out of the box including SQLite support, and many other things.
In cases where platform connectors provide platform-specific text, Bad Behavior will use the host platform’s i18n/l10n functions instead of using gettext directly. This will put some extra work on both myself and translators, but it is necessary to ensure maximum compatibility with all possible host platforms.
The integrated administrative pages which currently exist for WordPress will be generalized as much as possible, so that their functionality can be provided for multiple host platforms. Because every platform has a unique method of handling administrative pages, this may not be complete for all platforms at 3.0 release. At minimum, though, platforms which provide such administrative pages should all be able to change Bad Behavior’s settings and manage a whitelist through such a page.
For platforms capable of it, a second administrative page will allow full searching through Bad Behavior’s database, as the WordPress port does today, in addition to export functionality which you will be able to use to send me copies of spam you have received or other traffic that you think should have my attention. This export process will be released for WordPress shortly. As always, I hold such submissions in strict confidence, on encrypted media, use them solely for security analysis, and destroy them within 90 days. I never use personally identifying information which might be present in such submissions.
Database access has been generalized further so that different platforms can provide database access in their own unique ways. This new design is highly database agnostic, and fairly closely resembles Drupal’s database abstraction. It allows for the use of almost anything from SQLite to Oracle and much in between, including the use of database masters/slaves as in MediaWiki.
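The idea of letting each platform supply its own database access can be sketched as a small interface with one concrete backend. This is an illustrative Python sketch, not Bad Behavior's actual (PHP) design; the class names, table name and column names are all assumptions.

```python
import sqlite3
from abc import ABC, abstractmethod


class Database(ABC):
    """Minimal interface the core would talk to (hypothetical)."""

    @abstractmethod
    def execute(self, sql, params=()):
        """Run a statement that changes data."""

    @abstractmethod
    def query(self, sql, params=()):
        """Run a SELECT and return all rows."""


class SQLiteDatabase(Database):
    """One possible backend; a connector could supply a different one."""

    def __init__(self, path=":memory:"):
        self._conn = sqlite3.connect(path)

    def execute(self, sql, params=()):
        # The connection context manager commits on success.
        with self._conn:
            self._conn.execute(sql, params)

    def query(self, sql, params=()):
        return self._conn.execute(sql, params).fetchall()


# Core-side helpers depend only on the Database interface.
# Note: "?" placeholders are SQLite style; a real abstraction would also
# have to hide placeholder differences between database drivers.
def log_request(db, ip, key):
    db.execute("INSERT INTO requests (ip, support_key) VALUES (?, ?)", (ip, key))


def find_by_ip(db, ip):
    return db.query("SELECT ip, support_key FROM requests WHERE ip = ?", (ip,))


db = SQLiteDatabase()
db.execute("CREATE TABLE requests (ip TEXT, support_key TEXT)")
log_request(db, "192.0.2.1", "abcd1234")
rows = find_by_ip(db, "192.0.2.1")
```

A master/slave setup could fit the same shape by sending execute() to the master and query() to a replica, which is one reason to keep the two methods separate.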
Development Process Changes
The test suite planned for Bad Behavior 2.2 never quite took shape, which has resulted in several embarrassing incidents where code was released that still contained obvious errors and typos. With Bad Behavior 3, I will be building the test suite (using PHPUnit) alongside the code, ensuring 100% coverage. Hopefully this will make for more stable releases.
I have set up a completely new development environment which is linked to GitHub. GitHub will be the primary source code repository for Bad Behavior 3, and from it, release engineering scripts will test the code and construct releases for all available platforms. These releases will then be offered for download here and pushed to third party download sites such as WordPress. This process flow should virtually eliminate releases with syntax errors and obvious regressions.
Using GitHub will also allow me to integrate more closely with third parties who develop platform connectors, by pulling in their updates as they make them available and by providing users with a single download regardless of host platform. I’ll be providing more details on this as work progresses.
Spam Prevention
The new techniques for blocking spam from hijacked web browsers which I mentioned above will be incorporated into Bad Behavior 3.
I am currently working on a ruleset-based design which will allow for Bad Behavior’s spam blocking rules to be distributed independently of the core and the platform connector. This will simplify most updates and allow for environments which restrict updates, such as enterprise installations, to still keep up to date on spam blocking rules. Again, these updates will be protected with digital signatures.
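One way to picture rules that ship separately from the core is as plain data that the core loads and evaluates. The sketch below is in Python and uses an invented JSON layout; the rule ids, field names and patterns are hypothetical, and signature verification is left out entirely.

```python
import json
import re

# Hypothetical ruleset format: each rule names a request header and a
# regular expression that marks the request as suspicious when it matches.
RULESET_JSON = """
{
  "version": 1,
  "rules": [
    {"id": "empty-user-agent", "header": "User-Agent", "pattern": "^$"},
    {"id": "example-bad-bot", "header": "User-Agent", "pattern": "ExampleSpamBot"}
  ]
}
"""


def load_rules(text):
    """Parse the ruleset and precompile each pattern."""
    data = json.loads(text)
    return [
        (rule["id"], rule["header"], re.compile(rule["pattern"]))
        for rule in data["rules"]
    ]


def first_match(rules, headers):
    """Return the id of the first rule that matches, or None."""
    for rule_id, header, pattern in rules:
        # A missing header is treated as an empty string.
        if pattern.search(headers.get(header, "")):
            return rule_id
    return None


rules = load_rules(RULESET_JSON)
```

Because the rules are data, replacing the ruleset file changes what gets blocked without touching the code that evaluates it.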
A feature planned for 2.2 was to allow Project Honey Pot users to provide honey pots or QuickLinks on their web sites. This is still something I want to do, and the new platform connectors should make it possible. No guarantees on this, though.
Status
As I write this, the display at the bottom of this page says Bad Behavior has blocked more than 19,000 access attempts in the last week, on this site alone. In that same time, 34 messages got through and were caught by Akismet, which I use as my secondary spam plugin.
Now I’m after those last 34.
But, as I mentioned above, I need your help.
One night back in early 2005, when I first started blogging, I got my first comment spam. Unfortunately, my first comment spam was followed by 700 more over the space of a few hours. As you can imagine, I was thoroughly pissed. I spent some time looking at anti-spam solutions, but at the time there wasn’t much, and what there was didn’t work all that well. I felt I had to roll my own. A couple of months later, Bad Behavior was born.
I still clearly remember cleaning up after that first incident, and killing link spam has become something of a personal crusade for me. But I’ve learned that I can’t possibly do it all alone. Fortunately this field has grown significantly and there are now a whole lot of smart people working on various aspects of the link spam problem. What Bad Behavior brings to the table is to take that 700 spam attack and allow fewer than one percent to reach your blog. Having to clean up 7 spam is much easier than cleaning up 700. (This is one reason why I advise using more than one anti-spam solution.)
As new spammers start up and new botnets come online, some find themselves already blocked, while others need to be analyzed and updates made to block them, so Bad Behavior will always require continuous development. Often this development is delayed because I have to pay bills. As you may be aware if you’ve been a very long-time user, I lost my job in 2005 and since then I have lived on revenue from blogging and paid web consulting work. Therefore I can only work on Bad Behavior when my finances permit.
Historically, keeping up with the spammers has not been that difficult, as there is only so much the spammers can do while maintaining their high rates of spamming. Today, 100,000 or more spams in a single run is not unusual, and one spammer I’ve blocked can send 1,000,000 in a day. Bad Behavior attempts to drive up the cost of link spamming by blocking as many automated spammy requests as possible, forcing the spammers to resort to much slower manual methods, or ideally, give up and find more honest work. And Bad Behavior 3 promises to cut into the spam delivered by those much slower methods.
Only one thing remains, and that is to do the work. As I have noted before, Bad Behavior is a user-supported project. If you think this roadmap looks good, and want to accelerate Bad Behavior development, your financial contribution will help ensure that I can devote more time to its development and bring it to fruition much faster. Otherwise, I have to spend my time first on other work which brings in revenue, and that means it will be much longer before you see these features.
I would estimate that all of the above would take me about six months to complete if it isn’t otherwise funded. At the same time I think contributions totaling $500 or more would allow me time to complete the majority of the above within a month. I know that a lot of you are having financial trouble due to the economy; so am I. Even if you are unable to send a contribution, please leave your comments so that I know you support Bad Behavior and wish it to continue. And, thank you to all of you who have sent in contributions recently.
This is also the time to send in feature requests. If Bad Behavior doesn’t do something you would like it to do, please leave a comment. (And remember that feature requests accompanied by a contribution are more likely to be implemented sooner.)
On that note, if you know someone who needs custom code written for WordPress, you should also contact me.
Thank you again for your support, and here’s to a future without spam.
P.S. I am still looking for someone who knows how to deliver electric shocks over the Internet. If you do, please contact me. This could be the ultimate spam-prevention feature.
Since the first release of Bad Behavior four years ago, tens of thousands of WordPress users have used it to protect their sites from the scourge of link spam. Bad Behavior’s second major release, just a year after the first, was a major redesign that has stood the test of time. Bad Behavior became even easier to port to other web site platforms as well as easier to add new features and block new spam.
Now the design needs a few tweaks. This work will eventually become Bad Behavior 2.2. Today I want to update you on some of the changes Bad Behavior needs and what I’m planning for the 2.2 version.
As I noted with today’s 2.0.32 release, development of the 2.0 branch has been limited to bug fixes and security issues so that I can concentrate development on this new version. The development will take place in versions numbered from 2.1. As a development branch, it won’t be appropriate for everyone, but many of you will be interested in following its progress.
Before I get into the details of the roadmap, there’s something I haven’t talked about in a while and should probably do again. Bad Behavior has been a personal project of mine for almost five years now. It was born out of an incident, a couple of months after I started blogging, where I got my first comment spam. Unfortunately, my first comment spam was followed by 700 more over the space of a few hours. As you can imagine, I was thoroughly pissed. I spent some time looking at anti-spam solutions, but at the time there wasn’t much, and what there was didn’t work all that well. I felt I had to roll my own. A couple of months later, Bad Behavior was born.
I still clearly remember cleaning up after that first incident, and killing link spam has become something of a personal crusade for me. But I’ve learned that I can’t possibly do it all alone. Fortunately this field has grown significantly and there are now a whole lot of smart people working on various aspects of the link spam problem. What Bad Behavior brings to the table is to take that 700 spam attack and allow fewer than one percent to reach your blog. Having to clean out 7 spam from the moderation queue is much easier than cleaning out 700. (This is one reason why I advise using more than one anti-spam solution.)
The main technique Bad Behavior uses to accomplish this is to block bots which scrape your site to get access to your comment forms, login forms and other such forms on your site. Once a bot has the form, it can pass it around a botnet and send dozens of spams to that page from all over the world. Preventing malicious bots from accessing the forms in the first place stops the majority of spam. The remainder is a variety of techniques used to identify poorly coded bots which imperfectly masquerade as legitimate web traffic.
As new spammers start up and new botnets come online, some find themselves already blocked, while others need to be analyzed and updates made to block them, so Bad Behavior will always require continuous development. Often this development is delayed because I have to pay bills. As you may be aware if you’ve been a very long-time user, I lost my job in 2005 and since then I have lived on revenue from blogging and paid web consulting work. Therefore I can only work on Bad Behavior when my finances permit.
Today my finances do not permit me to do any further work on Bad Behavior, mainly due to the economic recession. If you want this work to continue, as I’ll outline in the roadmap below, skip your morning latte tomorrow and send me a financial contribution. The amount is blank, so fill in whatever you feel is appropriate.
And if you see any problems with the roadmap, or feel it could be improved, feel free to comment below.
Core Changes
The most important change won’t be visible right away. A design change to the core is needed to enable Bad Behavior to be tested using more rigorous test methods. The earliest 2.1 releases will contain this change and I will write tests for each of Bad Behavior’s existing checks. Before the 2.2 stable release, and going forward, a test will be written for each feature introduced into Bad Behavior, to help prevent obvious and silly bugs which require almost immediate updates to fix, as happened with 2.0.30 through 2.0.32. The test suite which emerges from this work will ship as a downloadable package, so that you can test Bad Behavior yourself. (Thanks to Tony Bibbs for suggesting this change.)
Bad Behavior’s various whitelists will be moved out of the core and into a separate file template, downloaded separately from Bad Behavior. This will allow you to update Bad Behavior without disturbing your personal whitelists. This is currently an issue for all platforms. On platforms which support an integrated administrative page for changing Bad Behavior’s settings, and can store settings in the host platform’s database, the whitelists will be manageable from within the administrative page.
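A whitelist kept in its own file could be as simple as one address or network per line. Here is a minimal Python sketch of reading such a file and checking an address against it; the file format shown is an assumption for illustration, not the template Bad Behavior will ship.

```python
import ipaddress

# Hypothetical whitelist file contents.
WHITELIST_TEXT = """
# One entry per line; blank lines and comments are ignored.
192.0.2.0/24
198.51.100.7
"""


def parse_whitelist(text):
    """Turn the file contents into a list of network objects."""
    networks = []
    for line in text.splitlines():
        entry = line.split("#", 1)[0].strip()  # drop comments
        if entry:
            # A bare address becomes a single-host network.
            networks.append(ipaddress.ip_network(entry, strict=False))
    return networks


def is_whitelisted(ip, networks):
    """True if the address falls inside any whitelisted network."""
    address = ipaddress.ip_address(ip)
    return any(address in network for network in networks)


networks = parse_whitelist(WHITELIST_TEXT)
```

Keeping the entries in a separate file means an update can replace the code without overwriting anything the site owner has added.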
Platform Connector Changes
On platforms which do not support an integrated administrative page for changing Bad Behavior’s settings, and require settings to be placed in the platform connector’s file, these settings will be placed in a separate file, downloadable separately from the platform connector. This will allow for the incorporation of settings for new features without updating the platform connector, or conversely, updating the platform connector without disturbing your settings. This is currently an issue for the Drupal module, MediaWiki extension, and possibly other platforms.
The integrated administrative page will be introduced for more platforms. I had originally intended to write this myself for MediaWiki, whose platform connector I maintain, but the lack of adequate developer documentation made it virtually impossible. (The documentation seems to have improved greatly since then, so I’m going to make another attempt at it.) I expect that these pages are going to be highly specific to the platform and that little code can be shared between them. If you maintain a platform connector and need assistance with implementing this, please contact me.
The integrated administrative page will be enhanced to allow more complex searching through the database records. Currently it is not possible to search the records except by manually crafting a URL. In the future the entire database will be searchable and you will be able to mark records and forward them to me for analysis. Due to privacy concerns, records sent to me are kept on encrypted media at all times, used solely for analysis of how to permit or block similar traffic (as appropriate) and destroyed within 90 days. Personally identifying information, if present, is not used. I have done this since the beginning.
The current list of platform connectors needs to be updated; it’s come to my attention that some are out of date or their maintainers have stopped maintaining them. If you are, or want to be, a maintainer for a platform connector, please contact me.
The code which creates the database in a new Bad Behavior installation is currently in the core; however, it properly belongs in the platform connector, since it can vary by platform. For instance, the Drupal module already uses its own code for this, but the WordPress and MediaWiki connectors share the same code. This code will be moved out of the core and split into separate files to facilitate reuse where possible, give a slight performance gain, and enable other platforms to do their own initialization where needed.
I’ve identified several new situations in which it would be useful for Bad Behavior to call back to the platform connector to have the host platform perform some action or another. As a result, the platform connector API, such as it is, will expand. It will remain backward compatible, however, in case some platform does not or cannot implement the complete API.
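An API can grow while staying backward compatible if the core treats each new callback as optional. A minimal Python sketch of that pattern follows; the connector classes and callback names are hypothetical and do not correspond to the real platform connector API.

```python
def call_connector(connector, name, *args, default=None):
    """Invoke an optional connector callback.

    If the connector does not implement the callback, return the default
    instead of failing, so older connectors keep working.
    """
    handler = getattr(connector, name, None)
    if callable(handler):
        return handler(*args)
    return default


class OldConnector:
    """Implements only the original part of the API (hypothetical)."""

    def log_blocked(self, ip):
        return f"logged {ip}"


class NewConnector(OldConnector):
    """Adds a newer, optional callback (hypothetical)."""

    def notify_admin(self, ip):
        return f"notified admin about {ip}"


old_result = call_connector(OldConnector(), "notify_admin", "192.0.2.1")
new_result = call_connector(NewConnector(), "notify_admin", "192.0.2.1")
```

The old connector silently skips the new callback, while the new one handles it.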
The porting documentation needs to be greatly reworked and expanded. It doesn’t say much except to look at the existing code and base your work off of it, which is perhaps fine for some experienced programmers, but not for everyone.
Bad Behavior needs to be localized, that is, translated into languages other than English. This is still an open design issue, since each platform handles localization in a completely different manner and requires files containing localized translations to be installed in different places. The most likely solution at this point will involve “language packs” which you will be able to download separately from the core. In addition, people will be needed to help translate Bad Behavior. I will make a separate post when I’m ready to accept translations.
Spam Prevention
The core design change mentioned above, which will allow for improved testing, will also enable some new features which haven’t been implementable before, such as improved whitelisting of search engines. As you may know, Bad Behavior has been using the http:BL service from Project Honey Pot to detect spammers for some time now (if you enabled the feature). The http:BL service also identifies many different search engines and can be used to whitelist them, preventing such issues as the recent blocking of msnbot when it began using a suspicious user-agent string. This feature will be available for testing early in the 2.1 release cycle. The original methods of identifying major search engines will remain in place and be maintained for those who cannot use http:BL.
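For readers curious how an http:BL answer can distinguish a search engine from a spammer, here is a small Python sketch. It reflects my understanding of the public http:BL format (a DNS name built from the access key and the reversed IP address, and an answer of the form 127.days.threat.type, where type 0 means a search engine); check the details against Project Honey Pot's documentation before relying on them. The access key shown is a placeholder, and no DNS lookup is performed.

```python
def httpbl_query_name(access_key, ipv4):
    """Build the DNS name to look up for an IPv4 address."""
    reversed_octets = ".".join(reversed(ipv4.split(".")))
    return f"{access_key}.{reversed_octets}.dnsbl.httpbl.org"


def parse_httpbl_response(answer):
    """Interpret the dotted-quad answer returned by the lookup."""
    octets = [int(part) for part in answer.split(".")]
    if len(octets) != 4 or octets[0] != 127:
        raise ValueError(f"not a valid http:BL answer: {answer!r}")
    _, days, third, visitor_type = octets
    if visitor_type == 0:
        # For search engines the third octet is an engine identifier.
        return {"search_engine": True, "engine_id": third}
    return {
        "search_engine": False,
        "days_since_seen": days,
        "threat_score": third,
        # The type octet is a bit set.
        "suspicious": bool(visitor_type & 1),
        "harvester": bool(visitor_type & 2),
        "comment_spammer": bool(visitor_type & 4),
    }


# "abcdefghijkl" is a placeholder, not a real access key.
name = httpbl_query_name("abcdefghijkl", "192.0.2.1")
```

A whitelisting check would then only need to test the search_engine flag before applying any of the stricter user-agent rules.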
Speaking of Project Honey Pot, Bad Behavior will allow you to serve spammers honey pots or QuickLinks provided by the service, so that it can catch even more spammers.
A screener which uses JavaScript and cookies to identify legitimate users has been in Bad Behavior since the initial 2.0 release, but proved difficult to implement, as it required calls into the host platform which weren’t always available or didn’t work as expected. This feature has been disabled for years. I will finally revisit this technique, as I think there’s still some value in this approach.
And of course I will continue to kill spammers as they come across my radar screen.
Other
Bad Behavior’s documentation has always been less thorough than I would like. It will have to be revamped. In addition I will have to keep on top of it by writing documentation for new features as the new features are written, rather than afterward. Documentation will also need to be translated, and I will need your help for that. I will make a separate posting when I am ready to accept translations.
On many platforms, users currently have to download the Bad Behavior core, then the platform connector, and then upload them together on their web site. If not done perfectly, this can result in errors, or a completely broken site. Where possible, I plan to have a build system which, upon each release of the core, combines it with the platform connector for each platform, an optional language pack, as well as files such as the whitelist and settings templates mentioned above, creating a single download. This should make installing and updating the software more convenient and less error-prone for users of affected platforms.
Finally, I made a proposal long ago for Bad Behavior to automatically update itself. This is not appropriate for everyone, of course, but it may be useful for people on platforms which don’t provide update facilities for their plugins/extensions. This is still a post-2.2 change, though I want to do some preliminary work to see if it can be done reliably and what might be necessary to accomplish it.
I’ve also probably forgotten a few things. They’ll be announced when I remember them.
Status
Bad Behavior must continue to keep up with spammers as they attempt to adapt and find new ways to post their automated garbage. Historically, keeping up with the spammers has not been that difficult, as there is only so much the spammers can do while maintaining their high rates of spamming. Today, 100,000 or more spams in a single run is not unusual, and one spammer I’ve blocked can send 1,000,000 in a day. Bad Behavior attempts to drive up the cost of link spamming by blocking as many automated spammy requests as possible, forcing the spammers to resort to MUCH slower manual methods, or ideally, give up and find more honest work.
I believe the proposed changes outlined above will make Bad Behavior a much stronger tool for preventing link spam while at the same time making it more accessible to a wider variety of users and web site platforms.
Only one thing remains, and that is to do the work. As I noted before, Bad Behavior is a user-supported project. If you think this roadmap looks good, and want to accelerate Bad Behavior development, your financial contribution will help ensure that I can devote more time to its development and bring it to fruition much faster. Otherwise, I have to spend my time first on consulting and other work which brings in revenue, and that means it will be much longer before you see these features.
I would estimate that all of the above would take me about six months to complete if it isn’t funded. At the same time I think contributions totaling $500 or more would allow me time to complete the majority of the above within a month. I know that a lot of you are having financial trouble due to the economy; so am I. Even if you are unable to send a contribution, please leave your comments so that I know you support Bad Behavior and wish it to continue.
This is also the time to send in feature requests. If Bad Behavior doesn’t do something you would like it to do, please leave a comment. (And remember that feature requests accompanied by a contribution are more likely to be implemented sooner.) Due to a hard drive crash I’ve lost all email that was sent to me before August of this year, and possibly some more recent email as well. If you have emailed me with a feature request recently, and don’t see it included above, please also leave a comment.
Thank you again for your support, and here’s to a future without spam.
P.S. If anyone knows how to deliver electric shocks over the Internet, please contact me. This could be the ultimate spam-prevention feature.
This article applies to the 2.x.x series of Bad Behavior. If you are using a 1.x.x version of Bad Behavior, please update as soon as possible.
One of the two topics I hear about most frequently is the assertion that Bad Behavior has blocked a legitimate request from an actual user, sometimes even the owner of the blog! Since this seems to come up every so often, I’m going to see if I can help out, and maybe eliminate the need for some of these folks to contact me.
(But before we get started, if you are an AOL user, do not use the built-in AOL browser. Use Firefox or something else. And get a real ISP as soon as possible.)
Before doing anything else, ensure that you have the latest version of Bad Behavior. Do not leave a comment or contact me if you have failed to update to the latest version. Too many people have done exactly that. It is your responsibility to know how to install and update software on your own Web site.
The next thing to do is to determine why Bad Behavior blocked you. Bad Behavior will display a short message along with a technical support key and a link to “fix the problem yourself.” Make a note of the technical support key, and then click the link. You’ll be presented with more information on why the request was blocked and several suggestions on how to fix the problem.
If you’ve been blocked from a site, and you aren’t the site administrator, please contact that person first, as they will be able to access records on their web server which will be helpful in solving the problem. Be sure to provide them with the technical support key you received. (If you are trying to access a site from a corporate or government network, you may need to contact the network administrator for your company or government agency to resolve the problem.)
If you are the site administrator, and one of your users was blocked and has contacted you for help, you can go directly to the support page and look up their technical support key yourself. You can use either the 8-character key from your database entries, or the 16-character key shown to users, with or without hyphens. You’ll then see the page that would have been shown to that user.
But you should ensure that your user has already followed the suggestions given on the page. The support page is written with non-technical users in mind, and so those of you who really know what you’re doing probably won’t like it, but it’s been my experience that, excepting the occasional bug in Bad Behavior, almost every actual human being who sees the page is able to fix the problem themselves.
If you’re unable to fix the problem yourself, and you’re the site owner/administrator, get your IP address, or the user’s IP address, log in to your phpMyAdmin, and Search the wp_bad_behavior table for the IP address and the last half of the technical support key (without the hyphen). Export the records from phpMyAdmin in SQL format and send them to me. You do not need to zip them, but it’s OK if you do. Please do not export in any other format but SQL. If you send me a screenshot, a PDF, or even worse, an Excel file, I will curse your name until the end of days, and probably not respond.
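Going by the description above (an 8-character key in the database, a 16-character key shown to users, and the last half of the latter being what you search for), the relationship between the two keys can be sketched as follows. This is an illustrative Python helper built on that reading; the function name and the sample key are made up.

```python
def database_key(support_key):
    """Reduce a technical support key to the 8-character form to search for.

    Assumes, per the instructions above, that the database key is the last
    half of the 16-character key shown to users. Hyphens are ignored.
    """
    compact = support_key.replace("-", "").strip()
    if len(compact) == 8:
        return compact
    if len(compact) == 16:
        return compact[8:]
    raise ValueError(f"unexpected support key length: {support_key!r}")


# Made-up key, shown with and without hyphens.
key_a = database_key("0123-4567-89ab-cdef")
key_b = database_key("0123456789abcdef")
```

Both forms reduce to the same 8 characters, which is the value to search for alongside the IP address.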
Finally, if Bad Behavior has been valuable to you, please consider making a contribution to further Bad Behavior development.

Bad Behavior 2.0.8 has been released.
This version contains updates for various “false positive” reports and is recommended for all users.
Updated in this release (since 2.0.7):
- Verizon Wireless EV-DO users are no longer blocked.
- Blocked requests will be subject to a two-second delay before a response is sent. (See below.)
- Some blackhole lists previously used in Bad Behavior have been scaled back or removed.
- The address for the Bad Behavior Blackhole has been added. (See below.)
- Some new spambots have been identified and blocked.
In recent days spam attacks have been on the rise, with one especially obnoxious bot delivering requests so fast that some sites have been taken offline by them. While the requests aren’t especially numerous or resource-intensive, the most common software used by Web hosting providers is very inefficient at serving dynamic pages such as PHP-based Web sites. So even a moderate number of requests can take a whole server down, or lead the hosting provider to take the site down before the whole server goes down.
Bad Behavior now counters this by introducing a short two-second delay to blocked requests, before the HTTP response is sent. Since most spambots wait for the response before going on to the next request, this should sufficiently slow down most of the overly aggressive spambots and give Web site operators some breathing room. While I would have liked to put in a delay of a minute or more, there remains the slight chance that an actual human being would be blocked, and they should be able to get a response back in a reasonable time.
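The delay itself is a very small piece of logic: wait, then answer. A minimal Python sketch follows; the function name, the send_response callback and the 403 status are assumptions for illustration, not Bad Behavior's actual code.

```python
import time

BLOCK_DELAY_SECONDS = 2  # the two-second delay described above


def respond_to_blocked_request(send_response, sleep=time.sleep):
    """Pause before answering a blocked request.

    A bot that waits for each response before sending the next request is
    slowed down by the pause; a wrongly blocked human still gets an answer
    within a reasonable time.
    """
    sleep(BLOCK_DELAY_SECONDS)
    send_response(403, "Forbidden")  # status code is an assumption


# Usage with stand-in callbacks so the order of events is visible
# without actually sleeping.
events = []
respond_to_blocked_request(
    lambda status, body: events.append(("response", status)),
    sleep=lambda seconds: events.append(("sleep", seconds)),
)
```

Passing the sleep function in as a parameter keeps the delay easy to test without waiting two seconds per test.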
With respect to realtime blackhole lists, all of the existing lists target e-mail spam, and since spambots that send link spam are almost always also sending e-mail spam through the same servers, these are a fairly effective means of blocking link spam. However, since they target e-mail spam, they also block legitimate users. The primary issue here is that while an IP address may be added to a blackhole list quickly, it is not removed quickly, or at all, once the spam stops. Thus, people with dynamic IP addresses are unfairly blocked because some other customer was sending spam.
Bad Behavior Blackhole, which should go online within the next few weeks, is designed specifically for link spam. It adds IP addresses to its database quickly when actual spam is received, and in addition, drops the IP addresses once the spam stops. This helps prevent dynamic IP customers from being blocked because another user’s computer was sending spam. Once Bad Behavior Blackhole is online, all other realtime blackhole lists will be dropped from Bad Behavior.
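The "list quickly, drop once the spam stops" behavior can be sketched as a table of last-seen times with a quiet period. This is an illustrative Python sketch only; the class name, the one-hour quiet period and the use of plain numbers for time are assumptions, not the design of Bad Behavior Blackhole.

```python
class ExpiringBlocklist:
    """Lists an IP as soon as spam is seen, and drops it after a quiet period."""

    def __init__(self, quiet_period):
        self.quiet_period = quiet_period  # seconds without spam before delisting
        self._last_spam = {}  # ip -> time of most recent spam

    def report_spam(self, ip, now):
        """Record spam from an address; it is listed immediately."""
        self._last_spam[ip] = now

    def is_listed(self, ip, now):
        """True while the address has sent spam within the quiet period."""
        last = self._last_spam.get(ip)
        if last is None:
            return False
        if now - last > self.quiet_period:
            # The spam has stopped; forget the address.
            del self._last_spam[ip]
            return False
        return True


# Hypothetical one-hour quiet period; times are seconds on any clock.
blocklist = ExpiringBlocklist(quiet_period=3600)
blocklist.report_spam("192.0.2.1", now=0)
listed_soon_after = blocklist.is_listed("192.0.2.1", now=60)
listed_much_later = blocklist.is_listed("192.0.2.1", now=7200)
```

The next customer to receive that dynamic IP address is no longer blocked once the quiet period has passed.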
Download Bad Behavior now!
As always, if you find Bad Behavior valuable, please consider making a financial contribution. I develop Bad Behavior in my spare time, and every little bit means I have more spare time to devote to its development.
And don’t forget to subscribe to the RSS feed or the mailing list. (They’re the same content.)
A user wrote in to let me know that Bad Behavior 2 has finally been ported to Drupal.
The work is pretty early and needs some spit and polish, but you can get the early results from the Drupal site.