Blog Comment Spam

I run a WordPress blog [b4k4.ath.cx/wordpress/], and lately I have been getting more and more comment spam. I turned on comment moderation for spam words and for comments with more than 3 links (all the comment spam I was getting seemed to have 4+). However, it still doesn't seem like a very good solution.
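To make that moderation rule concrete, here is a minimal sketch in Python (not WordPress's own PHP, and not how WordPress implements it). The word list, the `MAX_LINKS` constant and the `needs_moderation` name are all made up for illustration.

```python
import re

# Hypothetical settings for illustration; not WordPress's real options or word list.
MAX_LINKS = 3
SPAM_WORDS = {"casino", "viagra"}

# Count anything that looks like the start of an http(s) URL as one link.
LINK_RE = re.compile(r"https?://", re.IGNORECASE)


def needs_moderation(comment: str) -> bool:
    """Return True if the comment should be held for moderation."""
    # Rule 1: more than MAX_LINKS links.
    if len(LINK_RE.findall(comment)) > MAX_LINKS:
        return True
    # Rule 2: any whole word from the spam list (case-insensitive).
    words = set(re.findall(r"[a-z0-9']+", comment.lower()))
    return not SPAM_WORDS.isdisjoint(words)
```

A spammer who notices the threshold just posts 3 links, which is part of why this doesn't feel like a good solution on its own.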

Since they're targeting the post form directly [wp-comments-post.php], renaming it could be a good idea, but I don’t like it. How about setting authorization cookie/session variables in some prior page (like wherever the “post comment” links are located)? I very much doubt that spambots are going to bother to pass along additional, randomly generated (md5?) cookies or form variables. Bots that don’t bother to read the original post would never get these pieces of data, and those that do would have to know they were important, or copy /all/ form data, which would be an annoying scrape at best.
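The token idea above could look roughly like this. It is a sketch in Python rather than WordPress PHP, and it swaps the plain md5 I mused about for an HMAC-SHA256 signature so the token can't be forged without the server's secret. The names `issue_token`, `verify_token`, `SECRET_KEY` and `TOKEN_LIFETIME` are hypothetical, as is the one-hour lifetime.

```python
import hashlib
import hmac
import secrets
import time
from typing import Optional

# Server-side secret. Generated per process here purely for illustration;
# a real site would store one persistent secret.
SECRET_KEY = secrets.token_bytes(32)
TOKEN_LIFETIME = 3600  # seconds; an arbitrary choice


def _sign(payload: str) -> str:
    return hmac.new(SECRET_KEY, payload.encode(), hashlib.sha256).hexdigest()


def issue_token(post_id: int, now: Optional[float] = None) -> str:
    """Build a token to embed as a hidden form field or cookie on the post page."""
    issued = int(time.time() if now is None else now)
    nonce = secrets.token_hex(8)  # random part, so tokens differ per page view
    payload = f"{post_id}:{issued}:{nonce}"
    return f"{payload}:{_sign(payload)}"


def verify_token(token: str, post_id: int, now: Optional[float] = None) -> bool:
    """Check the token sent back with the comment submission."""
    parts = token.split(":")
    if len(parts) != 4:
        return False
    tok_post, issued, nonce, sig = parts
    expected = _sign(f"{tok_post}:{issued}:{nonce}")
    # Constant-time comparison of the signature.
    if not hmac.compare_digest(sig.encode(), expected.encode()):
        return False
    # The token must have been issued for this post...
    if tok_post != str(post_id):
        return False
    # ...and must not be older than TOKEN_LIFETIME.
    current = time.time() if now is None else now
    return 0 <= current - int(issued) <= TOKEN_LIFETIME
```

The post page would call `issue_token` when rendering the comment form, and the comment handler would refuse anything for which `verify_token` returns False. A bot that posts straight to the handler never sees a valid token.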

We could use CAPTCHA techniques: show a picture of randomly generated characters and make commenters type them in, thus forcing spammers to solve hard AI problems. There has been some WordPress-specific hacking to this effect. Gudlyf’s Wordpress Hack: AuthImage does exactly this. However, this is fairly bad from an accessibility perspective. Is this a scenario where we screw over blind web users? It also depends on an image rendering library like GD, which is a pain in the ass.

Churchtown has taken an interesting approach in the Wordpress Support Blog: Comment Spam:

1) in robots.txt disallow the normal wp-comments-post.php
2) change the name of the actual wp-comments-post.php
3) allow only REFERERs from my own site
4) include disable script in (honey trap) wp-comments-post.php

This sounds like a rather effective banning strategy, though it hinges on several key points.

  1. bots accessing the normal wp-comments-post.php are in violation of the robots.txt rule, and thus deserve banning?
  2. renaming the wp-comments-post.php file still sounds like a poor solution, but it may be required
  3. I like the idea of only permitting requests with a proper referrer, but that header can be faked, so I would rather use it in addition to another header than on its own.
  4. automagical banning w00t!
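Points 2 through 4 can be sketched together as follows. This is Python rather than the PHP a real WordPress install would need, and it is not Churchtown's actual script. The renamed path, the in-memory `banned_ips` set and the function names are hypothetical, and I'm assuming the handler lives under /wordpress/ on my host.

```python
from typing import Optional
from urllib.parse import urlparse

SITE_HOST = "b4k4.ath.cx"
# The well-known name, now a honey trap (and disallowed in robots.txt).
HONEYPOT_PATH = "/wordpress/wp-comments-post.php"
# Hypothetical new name for the real handler.
REAL_PATH = "/wordpress/renamed-comments-post.php"

# In-memory ban list for illustration; a real setup would persist this.
banned_ips = set()


def referer_is_local(referer: Optional[str]) -> bool:
    """True only if the Referer header points at our own host."""
    if not referer:
        return False
    try:
        return urlparse(referer).hostname == SITE_HOST
    except ValueError:
        # Malformed header value.
        return False


def handle_comment_post(path: str, ip: str, referer: Optional[str]) -> str:
    """Decide what to do with a POST to a comment handler path."""
    if ip in banned_ips:
        return "banned"
    if path == HONEYPOT_PATH:
        # Nothing on the site links here any more, so only bots arrive.
        banned_ips.add(ip)
        return "banned"
    if path != REAL_PATH:
        return "not found"
    if not referer_is_local(referer):
        return "rejected"
    return "accepted"
```

Since the Referer header can be faked, the referer check only filters out lazy bots. The honey trap does the actual banning.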

A combination of strategies would likely provide the most successful solution.

Refactored from comments in A Bird’s Melody: Blog Spam.

