New comment spam deterrent implemented

DMZ · January 18, 2005 at 2:12 pm · Filed Under Site information 

You may not have seen it, but we’ve been getting slammed by a new wave of difficult-to-defeat comment spammers. They’re leaving innocuous comments and then linking in the text and using the URL field to point somewhere that offers some… product that has to be spammed, I guess. They’re coming from many different IPs, probably on hacked end-user PCs, so it’s been hard to stop with filtering or my secret anti-spamminator technology, which relies on knowing that.

So anyway, I’ve implemented a tweak courtesy of one Dave Pease that should entirely knock out automated comment spammers and huuuuuuuuuuugely reduce the amount of time we spend keeping the comments clean.

Bad part is, it requires an extra keystroke/mouse move for every comment submission. Compared to the burden of registration, it’s no big deal, but I wanted to point it out.

Also, I know we’ve been having sporadic site outages — can’t connect, database didn’t serve a page up. Apologies, as always. I’m still working on finding better hosting.

Anyway, back to the grind.

[update: I want to also say that if this proves effective, I plan to scale back some of the back-end restrictions that result in moderation for many AOL users]

Comments

30 Responses to “New comment spam deterrent implemented”

  1. Roger on January 18th, 2005 2:22 pm

    Cool. I don’t get spammers…there must be herds of idiots out there, buying this junk, or no one would do it.

  2. Evan on January 18th, 2005 2:39 pm

    I love Pease’s trick. It’s incredibly clever.

  3. Jagermeister on January 18th, 2005 2:51 pm

    Buy my product!!

    Just kidding, thanks for all the work you guys put into this site, it’s a great, great site.

  4. devil's advocate on January 18th, 2005 2:59 pm

    That’ll work for bots that are simply trying to spam WordPress sites, but I fail to see how that will block comment spam that’s specifically directed at this site.

    I wonder if you’ve considered putting in one of these things:
    http://www.captcha.net

    I think they’re fairly common, for a good example check out the Pandagon blog (warning: extreme liberal bias ahead!) at http://www.pandagon.net. IMHO these things are the best spam-blocking tool short of requiring registration. There are PHP implementations out there.

  5. Tim on January 18th, 2005 3:05 pm

    My only complaint is not posting this before implementing the filter (but nobody’s perfect). I didn’t notice the change and when I submitted my great logic on why Ron Villone sucks, my comment was lost in cyberspace forever. After losing my masterpiece, I was unable to rewrite my post. But thanks for your work.

  6. devil's advocate on January 18th, 2005 3:12 pm

    While this will most likely stop bots that are trying to spam WordPress sites, I fail to see how it’ll stop spam that is specifically directed at your site. Maybe that doesn’t exist, however; I really wouldn’t know.

    Anyway, wondering if you’ve checked out the type of image-based word verification that they use at Ticketmaster, among many other places. The technical term for it is a CAPTCHA – Completely Automated Public Turing test telling Computers and Humans Apart. I know there are PHP implementations. IMHO this is the best way to stop spam dead, short of requiring registration.

  7. DMZ on January 18th, 2005 3:14 pm

    We’ll see how this works out before I implement anything that’s harsher. At that point I’d almost rather require registration for comments, so people could at least save their info and not be too put out.

  8. DMZ on January 18th, 2005 3:39 pm

    And Tim, I’m sorry.

  9. David J Corcoran on January 18th, 2005 3:44 pm

    Registration really isn’t a bad thing if you can get cookies and the lot set up. A one-time login per computer would be great (unless you are me and are paranoid and delete your cookies every day). But this should altogether stop computer-generated link spammers.

  10. David J Corcoran on January 18th, 2005 4:14 pm

    Hey, DMZ, a buddy of mine on another forum thinks he was banned, read the post. He posted as “kearly” on your board:

    ————-
    This post is actually a question of sorts for the handful of posters here that also frequent the ussmariner.com blog (i.e. Corco [me]).

    “Recently, a new “anti-spammer” rule was implemented, meaning certain IP addresses were banned for making too many posts, or posts that posted links to other sites, or posts that didn’t contain relevance discussion.

    I came to find out today that my IP address has been marked as spammer. The thing is though, I’ve only posted on that site maybe 8 times all-time, and I’ve never posted links. On two occasions, I think it was “dave” disagreed with me strongly over a few of my opinions, and in fact on my very first post he threatened to ban me for suggesting a potential Howard (phillies) for spiez and Winn trade, which was actually a very hot rumor at the time.

    Of course, I thought he was just kidding.”
    —-
    Was he banned or did he forget to use the new drop-down box?

  11. Tim on January 18th, 2005 4:25 pm

    DMZ,

    No worries, if it’s not worth rewriting, it probably wasn’t that insightful to begin with. Thanks for insights and maintaining the site.

  12. DMZ on January 18th, 2005 4:29 pm

    Generally speaking, we don’t comment on specific cases of whether they are or not. That said —

    Recently, a new “anti-spammer” rule was implemented, meaning certain IP addresses were banned for making too many posts, or posts that posted links to other sites, or posts that didn’t contain relevance discussion.

    This isn’t true. There are moderation queues that do cover certain IP ranges, which I’ve acknowledged before — AOL, for instance, Bay area ISPs when we had Ichiro! trolls in here — but even in those cases, they go into a moderation queue before they go live.

    I came to find out today that my IP address has been marked as spammer.

    I wonder how he figured that out, exactly, since generally that’s not something I would comment on.

    So in short– I don’t know what he’s talking about, and I wish people would email us rather than go about making crazy claims about what we’re doing.
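The routing DMZ describes, where comments from certain IP ranges are held for moderation rather than banned, can be sketched roughly as follows. This is only an illustration in Python: the ranges below are reserved documentation networks standing in for real ISP ranges, and `MODERATED_RANGES` and `route_comment` are hypothetical names, not anything taken from this site's setup.

```python
import ipaddress

# Hypothetical moderated ranges. These are reserved documentation networks
# used as placeholders; the site's real ranges are not given in the thread.
MODERATED_RANGES = [
    ipaddress.ip_network("192.0.2.0/24"),
    ipaddress.ip_network("198.51.100.0/24"),
]


def route_comment(ip: str) -> str:
    """Return "moderation" for IPs in a moderated range, otherwise "live"."""
    addr = ipaddress.ip_address(ip)
    # Nobody is banned here: matching comments are queued for review
    # before they appear on the site.
    if any(addr in net for net in MODERATED_RANGES):
        return "moderation"
    return "live"


print(route_comment("192.0.2.17"))   # moderation
print(route_comment("203.0.113.5"))  # live
```

The point of the sketch is the distinction drawn above: a range match changes where the comment goes, not whether it is accepted.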

  13. Evan on January 18th, 2005 4:41 pm

    That message you get when you forget to flip the little switch does call you a spammer, so maybe that’s it.

  14. Paul Molitor Cocktail on January 18th, 2005 4:53 pm

    Pretty clever, though I like the image solution myself. I’m not sure if it has been defeated yet.

  15. devil's advocate on January 18th, 2005 5:30 pm

    i’ve tried some character-recognition programs and they can be very good at recognizing even highly corrupted text, and things like background color and grid lines won’t deter them. the one on Pandagon, for instance, is probably very defeatable by sophisticated programs. but i think the reality is that good OCR programs are not the kind of thing that gets put into a spam-bot.

  16. Will on January 18th, 2005 7:10 pm

    Ever since Billy Beane wrote _Moneyball_ you just can’t find a good baseball blog out there that stresses the little things that really make you a champion. I’m talking about clubhouse character, stolen bases and productive outs.

    For more, please check out my book, available at amazon.com and other sites, titled, _Baseball For Dummies_.

    Sincerely,

    Joe Morgan

  17. David J Corcoran on January 18th, 2005 7:28 pm

    I apologize for the lack of substance, but Re 16:

    “Ever since Billy Beane wrote _Moneyball_”

    LOL. Not many humorous moments on this board. That is a laugher, though!

  18. Will on January 18th, 2005 7:40 pm

    I try…

    I wonder how much spammers get paid??

  19. Matt Williams on January 18th, 2005 8:10 pm

    What sucks about spam is it only takes a couple idiots in a million to make it profitable. Especially when you consider that you can sell the email addresses of anyone silly enough to reply to the “unsubscribe” link.

  20. Dave on January 18th, 2005 8:24 pm

    devil’s advocate, the way wordpress (or movabletype, or … etc) comment spam works is that as the blogging tool becomes popular, the same shining example of humanity out there who is writing mass-mailer tools that randomly create from names like Ophelius Q. Snapdragon writes a tool that can post comments to a default installation of a blog. Then, spammers just feed their comment spamming tool the URLs of blogs that allow comments, and many comment spams are automatically posted on that blog.

    Unlike the CAPTCHA systems that some sites use, it’d be easy to program a fix for the challenge Derek implemented here today in one of those blog spamming tools. But unless this method becomes widely distributed, nobody who writes those tools will bother (or, to be honest, is even likely to know about it). I’m not planning on distributing the fix to anyone who doesn’t have a blog I really like, which means it won’t get very far past the USS Mariner boys.

    As far as I know, there’s no widespread problem with comment spammers that actually visit the sites they are spamming in their browser and enter comments manually. That’d be like an email spammer who crafts and sends a unique email by hand to everyone on their list of people who clicked ‘unsubscribe’ to a previous spam–with the presumably pathetic rate of return on spam advertising as it is, there’d just be no money in doing it without a significant amount of automation.
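A minimal sketch of the general idea described above, assuming the challenge is an extra form field that a human sets by hand and that a tool written against a default install never sends. The thread does not show the actual tweak, so the field name `human_check`, its expected value, and the function `accept_comment` are all hypothetical, and Python is used here purely for illustration.

```python
# Hypothetical sketch: reject comment submissions that lack an extra form
# field a human sets by hand. Names and values are illustrative assumptions,
# not the actual tweak used on this site.

EXPECTED_FIELD = "human_check"    # assumed name of the extra form control
EXPECTED_VALUE = "not-a-spammer"  # assumed value a human must select


def accept_comment(form: dict) -> bool:
    """Return True only if the submission carries the expected extra field."""
    # A tool written against a default blog install posts only the stock
    # fields (author, email, url, comment), so this lookup fails for it.
    return form.get(EXPECTED_FIELD) == EXPECTED_VALUE


# A bot posting only the default fields is rejected.
bot_post = {
    "author": "Ophelius Q. Snapdragon",
    "url": "http://example.com",
    "comment": "Nice site!",
}
# A human who set the extra control is accepted.
human_post = {
    "author": "Tim",
    "comment": "Why Ron Villone sucks...",
    EXPECTED_FIELD: EXPECTED_VALUE,
}

print(accept_comment(bot_post))    # False
print(accept_comment(human_post))  # True
```

As the comment notes, a check like this is trivial to defeat once a spam tool knows about the field; it works only because it differs from the default install.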

  21. Josh B on January 19th, 2005 3:51 am

    I don’t post here often, but I would assume for the more frequent posters that typing in a string of numbers every time you post would be quite bothersome. Registration w/ typing a string of numbers once would probably be much better if this method eventually gets defeated.

  22. GWO on January 19th, 2005 5:56 am

    As an extra anti-comment-spam device, Google are amending their service to reduce the efficacy of comment spam in increasing Google page rankings. See this page for more details on this.

  23. David J Corcoran on January 19th, 2005 6:48 am

    Gosh, I love to have the time to write mass-spam programs for individual blogs.

  24. David J Corcoran on January 19th, 2005 6:48 am

    I *would* love

  25. Shoeless Jose on January 19th, 2005 12:09 pm

    Of all the forums where I regularly read or respond to threads this is the only one where comment spamming comes up as a regular problem. And — not coincidentally — this is the only one that doesn’t require registration. I have never seen a forum (or anything other than ticket-buying sites and Network Solutions) use the captcha system, and I doubt it would last very long if tried: it is just too annoying. Even if you choose not to retain the registration from session to session (or users choose not to retain cookies), requiring registration to post is not so onerous. It’s not like it removes a layer of anonymity (since IP address for posters is already available) and as a bonus, because identity can be verified from post to post, it makes editing your own comments possible. This should actually improve the quality of the posts and make extra “one line to clarify the previous post” comments (like DJC’s above) unnecessary.

  26. DMZ on January 19th, 2005 12:27 pm

    Of all the forums where I regularly read or respond to threads this is the only one where comment spamming comes up as a regular problem. And – not coincidentally – this is the only one that doesn’t require registration.

    I don’t disagree that it might be true that registration prevents comment spamming, but your experience with a limited set of sites doesn’t really prove causality either way. We’ll see how this works out.

  27. dw on January 19th, 2005 1:00 pm

    You should consider jumping on the nofollow bandwagon.

    http://wordpress.org/support/topic.php?id=21187

    It won’t solve comment spam or even slow it down in the short term, but it will remove the incentive to use comment spam to bolster search engine rankings.
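For readers unfamiliar with nofollow: the idea is to mark links in comments with `rel="nofollow"` so search engines do not count them toward rankings. A rough sketch of that rewriting step is below. It is a regex-based illustration only, with `add_nofollow` as a hypothetical helper name; a real implementation would more likely use a proper HTML parser, and this version simply leaves alone any link that already carries a `rel` attribute.

```python
import re

# Matches the start of an anchor tag that has no rel= attribute yet.
# Simplification: a URL that itself contains "rel=" will also be skipped.
_ANCHOR_WITHOUT_REL = re.compile(r"<a\s+(?![^>]*\brel=)", re.IGNORECASE)


def add_nofollow(comment_html: str) -> str:
    """Add rel="nofollow" to anchor tags in a comment's HTML."""
    return _ANCHOR_WITHOUT_REL.sub('<a rel="nofollow" ', comment_html)


print(add_nofollow('Buy <a href="http://example.com">my product</a>!!'))
# Buy <a rel="nofollow" href="http://example.com">my product</a>!!
```

The spam still gets posted under this scheme; what changes is that the link stops being worth anything for search ranking.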

  28. Oscar Gamble's Afro on January 19th, 2005 4:32 pm

    Too bad there wasn’t a way to create a “Do Not Sign Aaron Sele” filter. Yeesh:

    http://seattletimes.nwsource.com/html/sports/2002154919_websele19.html

  29. roger tang on January 20th, 2005 1:27 pm

    Heh. Cute.

  30. academia informatica on February 13th, 2005 7:14 am

    Three rules for the spam game:

    1) you can not win.
    2) you can not draw.
    3) you can not leave the play.

    Greetings,

    Antonio, from Malaga (Spain)