Bad Behavior - Spam Eliminator
As I mentioned previously in You’ve Got Spam, it was not long after I started using WordPress that spam began hitting my comments and trackbacks. I did a couple of things to help prevent comment spam, but what stopped the spam cold was Bad Behavior.
Bad Behavior is a set of PHP scripts which prevent spambots from accessing your site by analyzing their actual HTTP requests and comparing them to profiles of known spambots. It goes far beyond the User-Agent and Referrer fields, however.
The problem:
Spammers run automated scripts which read everything on your web site, harvest email addresses, and if you have a blog, forum or wiki, will post spam directly to your site. They also put false referrers in your server log trying to get their links posted through your stats page.
As the operator of a Web site, this can cause you several problems. First, the spammers are wasting your bandwidth, which you may well be paying for. Second, they are posting comments to any form they can find, filling your web site with unwanted (and unpaid!) ads for their products. Last but not least, they harvest any email addresses they can find and sell those to other spammers, who fill your inbox with more unwanted ads.
To date, most anti-spam solutions for WordPress have focused on analyzing the content of comments as they are being posted. Little attention has been paid to analyzing the spambots themselves. And most anti-spambot solutions in general have focused on the User-Agent and Referrer fields in the web requests made by these bots.
While these pick off some of the spammers, the more sophisticated ones fake both the User-Agent and Referrer and make such checks mostly useless, unless the spammer makes a typo in the User-Agent.
The Bad Behavior Solution:
Bad Behavior was designed and built by watching actual spambots which harvested email addresses, posted comment spam, and used fake referrers. By logging their entire HTTP requests and comparing them to the HTTP requests of legitimate users, it is possible to detect most spambots. Bad Behavior blocks spambots with an HTTP 412 (Precondition Failed) error. It also has three configurable User-Agent lists for spambots and other malicious bots which actually identify themselves. Bad Behavior can use string matching or regular expression matching against a User-Agent.
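To make the idea concrete, here is a minimal sketch in Python of the two kinds of checks described above: matching the User-Agent against string and regular expression lists, and comparing the rest of the request to what a real browser would send. This is not Bad Behavior's actual code (which is PHP). The list entries, the function names, and the "a browser always sends Accept" rule are assumptions made up for illustration.

```python
import re

# Hypothetical blocklists; illustrative entries only, not Bad Behavior's real lists.
UA_SUBSTRINGS = ["EmailSiphon", "EmailCollector"]
UA_PATTERNS = [re.compile(r"^Mozilla/4\.0$")]  # bare version string, no platform details


def user_agent_blocked(user_agent):
    """Return True if the User-Agent matches a substring entry or a regex entry."""
    lowered = user_agent.lower()
    if any(s.lower() in lowered for s in UA_SUBSTRINGS):
        return True
    return any(p.search(user_agent) for p in UA_PATTERNS)


def looks_like_browser(headers):
    """Assumed profile rule: a request claiming to be a browser should send Accept."""
    claims_browser = headers.get("User-Agent", "").startswith("Mozilla/")
    return not claims_browser or "Accept" in headers


def screen_request(headers):
    """Return an HTTP status code: 412 to block the request, 200 to let it through."""
    user_agent = headers.get("User-Agent", "")
    if not user_agent or user_agent_blocked(user_agent) or not looks_like_browser(headers):
        return 412
    return 200
```

The second check is the interesting one: a bot that fakes a browser User-Agent but forgets the other headers a browser sends would still be caught, which is the sense in which this approach goes beyond User-Agent and Referrer.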
Bad Behavior will also target bots which fail to obey robots.txt. At this time some of these bots are banned by User-Agent, though in the future Bad Behavior will detect them automatically.
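As a rough idea of what automatic detection could look like, the sketch below uses Python's standard urllib.robotparser to flag user agents that requested a path robots.txt disallows. How Bad Behavior would actually do this is not described here, so the log format, the sample robots.txt, and the function names are all assumptions.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt contents, for illustration only.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""


def make_parser(robots_txt):
    """Build a parser from robots.txt text instead of fetching it over HTTP."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def find_violators(parser, access_log):
    """Return the user agents that requested a path robots.txt disallows for them.

    access_log is assumed to be an iterable of (user_agent, path) pairs.
    """
    violators = set()
    for user_agent, path in access_log:
        if not parser.can_fetch(user_agent, path):
            violators.add(user_agent)
    return violators
```

For example, given a log where GoodBot/1.0 only requested /index.html and RudeBot/2.0 requested /private/data.html, only RudeBot/2.0 would be flagged. Human visitors never read robots.txt, so a real check would have to apply this to bots only.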
Bad Behavior intends to target any malicious software directed at a Web site, whether it be a spambot, an ill-designed search engine bot, or a system cracker. In that spirit, it is not limited to WordPress users; a generic interface has been provided whereby it can be integrated into virtually any PHP-based software.







Hey,
Cheers for the article, just implemented bad behaviour myself on my custom blog system. looks to be an excellent script so far!
Comment by Barry — June 6, 2006 @ 9:03 am