Saturday, October 18, 2003

Bayesian Spam Filtering for Movable Type

James Song has provided an elegant solution to the comment spam problem by adapting the SpamBayes engine to detect bogus comments. Song's solution seems to me rather more robust and scalable than the MT-Blacklist option: blocking entire IP ranges and filtering on individual words are blunt instruments for detecting undesirable comments, since neither takes any account of the context in which a word appears. What if one were to bring up Nabokov's Lolita in a discussion of literature?
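To make the Lolita point concrete, here is a toy sketch in Python contrasting a one-word blacklist with a statistical filter. To be clear about what is assumed: this is a plain naive Bayes classifier with add-one smoothing, not the actual scoring used by SpamBayes or by Song's plugin, and the class name, the function names and the six training comments are all invented for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def blacklist_flags(text, banned=("lolita",)):
    """Blunt instrument: flag a comment if any banned word appears at all."""
    return any(word in banned for word in tokenize(text))


class NaiveBayesFilter:
    """Toy naive Bayes comment filter (hypothetical, for illustration only)."""

    def __init__(self):
        self.counts = {"spam": Counter(), "ham": Counter()}
        self.docs = {"spam": 0, "ham": 0}

    def train(self, text, label):
        self.counts[label].update(tokenize(text))
        self.docs[label] += 1

    def spam_log_odds(self, text):
        """Log of P(spam | text) / P(ham | text); assumes both classes were trained."""
        vocab = set(self.counts["spam"]) | set(self.counts["ham"])
        n_spam = sum(self.counts["spam"].values())
        n_ham = sum(self.counts["ham"].values())
        # Start from the prior odds, then add the evidence from each known word.
        log_odds = math.log(self.docs["spam"] / self.docs["ham"])
        for word in tokenize(text):
            if word not in vocab:
                continue  # words never seen in training carry no evidence
            # Add-one (Laplace) smoothing avoids zero probabilities.
            p_spam = (self.counts["spam"][word] + 1) / (n_spam + len(vocab))
            p_ham = (self.counts["ham"][word] + 1) / (n_ham + len(vocab))
            log_odds += math.log(p_spam) - math.log(p_ham)
        return log_odds

    def is_spam(self, text):
        return self.spam_log_odds(text) > 0


# Invented training comments: "lolita" appears in both spam and legitimate ones.
bayes = NaiveBayesFilter()
bayes.train("cheap lolita pics click here", "spam")
bayes.train("buy cheap pills online click here", "spam")
bayes.train("free lolita pics online", "spam")
bayes.train("Nabokov's Lolita is a masterpiece of unreliable narration", "ham")
bayes.train("I disagree with your reading of the novel", "ham")
bayes.train("the narration in the novel is what makes the prose work", "ham")

literary = "Nabokov's Lolita is a great novel"
junk = "cheap lolita pics"
```

With this training set the blacklist flags both `literary` and `junk`, whereas `bayes.is_spam` rejects only `junk`: the word "lolita" counts mildly against the literary comment, but the words around it outweigh it. A real filter would of course need far more training data than six comments.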

An additional, unforeseen benefit of Song's filter would be to eliminate the sort of insult-riddled nonsense that passes for argument among many of the less intelligent commenters on popular weblogs (for copious examples, see this Calpundit post, and look out for comments by an individual called "Adam in MA"). Some people need to learn that calling others "fucks" and "idiots", and making copious use of terms like "fucking", "shit", "ass" and the like, is not acceptable in civilized company. The anonymity provided by the Internet tends to bring out the worst in a lot of people, and this plugin just might be the ticket for reining in some of these unhelpful tendencies - assuming, of course, that bloggers like Kevin Drum actually wish to see the worst of their cheerleaders restrained.