bopuc/weblog: Me on BlogSpam

October 23, 2003 17:52 | WebBlogging

Me on BlogSpam

I wrote two things today about Blog Comment Spam.

One was sent to Dave Farber's "Interesting People" mailing list (which he published! Yay!):

Hello Mr. Farber,
/.../, the issue of Weblog Comment Spam is
one I have been following, and fighting, with aplomb lately.

First of all I'd like to recommend, as initial relief, for Movable Type
users, two excellent "MT-Plugins". Yes, they require installation, but for
folks who have actually installed MT themselves, it is a snap and the
authors have outdone themselves in offering easy-to-use yet powerful
solutions.

The Plugins are:
James Seng's "MT-Bayesian"
Jay Allen's "MT-Blacklist"

Both authors have spoken about published and distributed
blacklists/whitelists, which, given the very organic nature of the blog
ecosystem, could be quite powerful. Imagine TypePad, Userland and Blogger
all performing Bayesian filtering and sharing their lists in real time...

Also, as noted in this thread, spam is a reality we encounter in many forms
in our environment: "snail mail" spam, e-mail spam... Personally I regard
most all forms of advertising as spam, but I digress.

Point being, I observe all this as I observe any living ecosystem, and, as
opined, the vibrant and rapid evolution and growth of the "blogosphere"
allows for a terribly acute perception of the development of this entity
(spam, that is).

Introduce a new organism into a stable ecosystem and watch what happens. The
inhabitants of the ecosystem are forced to adapt, take action, or be preyed
upon and die.

In the email ecosystem, the confines and limitations of the environment are
such that spam seems to be winning. The blog environment has much more
flexibility and tools at its (easy) disposal.

Just some thoughts. Thank you.

The second was a comment on Dav's weblog, where he introduced another take on the Turing test-style "read the characters in this randomly generated image and type them in to authenticate that you are a human being" spam-stemming technique (one I myself had suggested here a few weeks back but which I now wholeheartedly reject... thanks for setting me straight, Karl!):

Hi Dav!
Yeah, essentially these kinds of "Turing test" type deals are not the best way to go, even though they may seem so. As we all know, they essentially close the door on some people. To put it in a "high-level" way: we place the burden of the fight on folks who have nothing to do with it. Legit visitors/commenters (who are the 99% majority) are neither the propagators (the spammers) nor the victims (the blog "owners", us).

Since we cannot go to the source and fight the spammers themselves, it falls on us to deal with it.

Therefore, so far, and by far, the best solution is James Seng's Bayesian comment/ping filter. If we build this out in a distributed fashion and get the blog-makers to integrate it, it will be massively powerful and effective. Jay's Blacklist system is also good (both did magnificent jobs on their respective MT plugins, BTW), but is much more labor-intensive, especially in the long run.

My 2 cents. ;)

(see, now if I were blind or drunk.. or blind drunk, I could not have posted this comment... ;)

Hehe. :D
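
For the curious, here is a rough sketch of the kind of Bayesian comment filtering discussed above. It is not MT-Bayesian's actual code. The class name CommentFilter, its methods, and the sample comments are all made up for illustration, and the model is a plain naive Bayes classifier with add-one smoothing, which is only one of several ways such a filter could be built.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Split a comment into lowercase word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


class CommentFilter:
    """Hypothetical naive Bayes comment filter (illustration only)."""

    def __init__(self):
        # Per-label word frequencies and number of training comments.
        self.word_counts = {"spam": Counter(), "ham": Counter()}
        self.doc_counts = {"spam": 0, "ham": 0}

    def train(self, text, label):
        """Record one comment as 'spam' or 'ham' (legitimate)."""
        self.word_counts[label].update(tokenize(text))
        self.doc_counts[label] += 1

    def spam_probability(self, text):
        """Return the estimated probability (0..1) that text is spam."""
        vocab = set(self.word_counts["spam"]) | set(self.word_counts["ham"])
        total_docs = sum(self.doc_counts.values())
        words = tokenize(text)
        log_scores = {}
        for label in ("spam", "ham"):
            # Smoothed prior: how common this label is overall.
            score = math.log((self.doc_counts[label] + 1) / (total_docs + 2))
            total_words = sum(self.word_counts[label].values())
            for word in words:
                # Add-one smoothing so unseen words never give log(0).
                count = self.word_counts[label][word]
                score += math.log((count + 1) / (total_words + len(vocab) + 1))
            log_scores[label] = score
        # Turn the two log scores into a probability; clamp to avoid overflow.
        diff = log_scores["ham"] - log_scores["spam"]
        diff = max(-700.0, min(700.0, diff))
        return 1.0 / (1.0 + math.exp(diff))


# Made-up training data, standing in for comments a blog owner has flagged.
comment_filter = CommentFilter()
comment_filter.train("cheap pills buy now", "spam")
comment_filter.train("buy cheap watches online", "spam")
comment_filter.train("great post, thanks for sharing", "ham")
comment_filter.train("I disagree with your point about filters", "ham")

for comment in ("buy cheap pills", "thanks for the great post"):
    print(comment, "->", round(comment_filter.spam_probability(comment), 3))
```

The appeal over a blacklist is that the owner only marks comments as spam or not, and the word statistics do the rest. Sharing those counts between blogs or services would be one way to get the distributed effect described above, though how best to do that is an open question.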