Comment Spam: I'm sick of it - Drupal Version
I once wrote about how captchas helped me get rid of WordPress comment spam. After the last - unfortunately quite successful - spam attack on my now Drupal-powered web site, I decided to implement the same approach. Simply put, captchas are challenges meant to ensure that a real human is submitting a form, not a machine. Here are my experiences.
First, to enable captchas you need the captcha module. Don't be fooled by the documentation stating that it requires the textimage module. It doesn't. Instead, it serves simple, text-based captchas. I personally prefer text-based captchas over image-based ones, since they are a lot less error-prone, and they are accessible to (color)blind people, too. Additionally, as I will explain later on, image-based captchas can be bypassed as easily as text-based ones, so they don't provide additional security.
If properly configured, the captcha module asks a simple math question like "what is 2 + 8". Each operand is between 1 and 10, so the result is between 2 and 20, with a higher probability for values near 10. One would think this wouldn't pose a challenge to a spam bot, yet it does. In my experience, comment spam was dramatically reduced: only one spam bot managed to bypass the captcha, posting around one spam comment a day.
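To make the arithmetic concrete, here is a minimal Python sketch of such a challenge. It is not the captcha module's actual code (that is a PHP Drupal module); the function names make_challenge and sum_distribution are made up for illustration, and the sketch assumes both operands are drawn uniformly from 1 to 10.

```python
import random
from collections import Counter


def make_challenge(rng):
    """Return a (question, answer) pair with both operands in 1..10.

    Hypothetical helper for illustration only, not the module's API.
    """
    a = rng.randint(1, 10)  # randint includes both endpoints
    b = rng.randint(1, 10)
    return f"what is {a} + {b}", a + b


def sum_distribution():
    """Exact probability of each possible answer, assuming uniform operands."""
    counts = Counter(a + b for a in range(1, 11) for b in range(1, 11))
    return {total: n / 100 for total, n in sorted(counts.items())}


if __name__ == "__main__":
    question, answer = make_challenge(random.Random())
    print(question, "->", answer)
    for total, p in sum_distribution().items():
        print(f"{total:2d}: {p:.2f}")
```

Under that uniform assumption the most likely answer is 11, which comes up in 10 of the 100 operand pairs, while 2 and 20 each come up in only one. A bot that always answered 11 would therefore be right about one time in ten, so the challenge is weak on paper even though it works well in practice.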
I wrote my own captcha challenges (you can see them in action on the comment form below) to test whether this would stop the last robot, but it didn't. This brings up an interesting question: how do robots solve challenges?
Well, first of all, there usually is no intelligence in a robot's strategy. That is, they don't try to solve the riddle by parsing and understanding the question, but usually delegate it to a human being, like this: If you - as a spam bot - find a captcha, which in this case means a text field named "edit[captcha_response]", read the label's text and pass it on to a special server. Wait until the server returns an answer to the challenge. Post the answer. Meanwhile, the server uses the question to pose a captcha of its own to the next visitor of a high-traffic site, for example a free porn site, and returns that visitor's answer.
Image-based captchas can be bypassed the same way - that is, if the image cannot simply be read using Optical Character Recognition, better known as OCR (yes, the same way your scanner reads documents). This is not rocket science; it's even available as an open source library.
Back to the topic. To catch the last robot bypassing your filter, and even to prevent paid human beings from typing spam into your premium site's comment sections (who knows), you should install the spam module. It contains a link threshold filter that will catch another 98% of the comments coming through. The rest will be caught by the spam module's learning filter.
Note that the spam module makes use of a comment's links section, so this should be output in your comment template file, at least for comment administrators. I commented it out a while ago myself, and it took me months to figure out how to mark a comment as spam...
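The idea behind a link threshold filter can be sketched in a few lines of Python. This is only an illustration of the general technique, not the spam module's implementation: the regular expression, the function names and the default of two allowed links are all assumptions of mine.

```python
import re

# Assumption: a "link" is anything that starts with http:// or https://
# and runs up to the next whitespace character.
LINK_PATTERN = re.compile(r"https?://\S+", re.IGNORECASE)


def count_links(comment_text):
    """Count the URLs found in a comment body."""
    return len(LINK_PATTERN.findall(comment_text))


def exceeds_link_threshold(comment_text, max_links=2):
    """Return True if the comment holds more links than max_links.

    max_links=2 is an arbitrary illustrative value, not a module default.
    """
    return count_links(comment_text) > max_links


if __name__ == "__main__":
    comment = "Nice post! http://a.example http://b.example http://c.example"
    print(exceeds_link_threshold(comment))
```

Such a filter works because spam comments exist to plant links, while genuine comments rarely carry more than one or two.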
Another idea is to change the name of the captcha field from "edit[captcha_response]" to - say - "edit[i_love_squirrels]". I didn't test this, though.
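Taking the renaming idea one step further, the field name could be derived per form instead of being fixed, so a bot cannot hard-code it. The following Python sketch is untested against real bots and rests on assumptions: captcha_field_name, the secret key and the form token are hypothetical and do not come from the captcha module.

```python
import hashlib
import hmac


def captcha_field_name(secret_key: bytes, form_token: str) -> str:
    """Derive a hard-to-guess field name from a server secret and a form token.

    Both arguments are hypothetical: the secret stays on the server, and the
    token is whatever value identifies this particular rendering of the form.
    """
    digest = hmac.new(secret_key, form_token.encode("utf-8"), hashlib.sha256).hexdigest()
    # Keep the "edit[...]" shape of the original field name.
    return f"edit[c_{digest[:12]}]"


if __name__ == "__main__":
    print(captcha_field_name(b"server-side secret", "form-token-123"))
```

The server can recompute the same name when the form is submitted, so nothing extra needs to be stored. This would only defeat bots that look for the field by name; one that reads the label and relays the question would not be affected.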


I really enjoyed reading this article. I myself have often thought about similar ideas to stop these annoying robots, so thank you for writing your ideas!
I got spam messages even with enabled captchas :(
So now all new posts in my blog are waiting for my approval.
I got spam messages even with enabled captchas.
So did I?