Mollom brings enhanced content protection

Mollom moved from a private beta to a public beta today. I've been fortunate enough to participate in the private beta and can say this stuff rocks. There are plenty of methods for protecting sites from spam and bad content, but this is by far the best so far. The problem with many systems is that they treat legitimate users as the enemy. Traditional CAPTCHA systems, at least, challenge the user for an answer before the user has provided any information to suggest they should be challenged. Users pay the penalty for the bad behavior of the spambots. There are systems that work around this, and some, like Akismet, have done pretty well on this site. The problem with many of these services, however, is that they can still be gamed to a considerable extent. And since the spambots don't recognize that their attempts to add content have been unsuccessful, they merrily pound away on the server. The other methods also generally call for administrators to monitor things pretty closely. Even with all the great spam tools, a couple of spam posts a week would still slip through on this site. During the transition to Mollom a bot was actually attempting to post, and in the few seconds the site was unprotected a couple of posts slipped through. In the weeks since, nary an errant post has been made (aside from one on an article that should not have been configured to accept comments, but that wasn't spam).

There are sites like GoDaddy that take this to the extreme, with CAPTCHAs for logged-in users all over the place. Granted, they have problems with automated logins and people abusing services, but somewhere the madness has to stop and they need to come up with a better solution.

But let's take a look at the impressive results on another site. The site is pretty neglected and doesn't have a lot of traffic, but benign neglect made it a haven for spambots. They loved to post away. The graph below shows the spam posts since installing Mollom on the site.

graph of Mollom's effectiveness

So not only did Mollom successfully keep the spam off the site, but the bots have started to leave the site alone. Quite impressive results. Will they be back? On high-value sites they surely will. The notes on the public beta mention that some of the issues from the private beta have been fixed. Those issues sometimes made me feel like Mollom was out to get me, asking for a CAPTCHA on every post I made.



Mollom is not just your

Mollom is not just your average spam-fighting service. It is based on a radically new approach that both improves its spam fighting precision over time and reduces the moderation effort needed to correct its mistakes. After analyzing your content, we not only return a 'spam' or 'ham' result, we also return 'unsure'. If Mollom cannot be 100% certain into which class to put your submitted content, we categorize it 'unsure' and a CAPTCHA challenge is shown on the content submission form to authenticate that the user is human.

Spam fighting tools compute a score based on the words and links present in the content under investigation. This 'spaminess' score indicates how likely it is that a post is spam. Conventional spam fighting tools return a 'ham' result when it seems _likely_ that a post is ham rather than spam, given its spaminess score. This decision line is shown in the graph above. Here, the green line denotes known ham messages, while the red line denotes known spam messages. So if a message is analyzed, and its spaminess score is to the right of the decision boundary, it is considered to be ham.

What is the problem with this approach? Not all content is correctly classified. This may appear to be only a tiny fraction on the plot, but when millions of messages are being processed all the time, we are talking about thousands of misclassified messages every day. Some posts that are actually spam land on the right side, the ham side, of the decision boundary where they don't belong. This spam is not recognized by the system and is allowed onto your site. On the other hand, some legitimate messages fall into the spam bucket and will be blocked from your website. Neither of these is a desirable outcome. To counteract this, a conventional spam blocking system dumps all the messages in the spam category into a moderation queue. The site moderator has to periodically go through all of it to pick out the few ham messages misplaced among the spam. It is like looking for a needle in a haystack and not something anyone looks forward to doing.
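
To make the conventional setup concrete, here is a minimal Python sketch of a two-class filter with a moderation queue. It is only an illustration: the 0..1 score axis (larger values sit further right on the graph and are more ham-like), the boundary value and the function names are all assumptions, not anything taken from a real spam filter.

```python
# Hypothetical decision boundary on a 0..1 axis where, as on the graph
# described above, scores to the right (larger) are treated as ham.
DECISION_BOUNDARY = 0.5


def classify_two_class(score):
    """Return 'ham' if the score lies right of the boundary, else 'spam'."""
    return "ham" if score > DECISION_BOUNDARY else "spam"


def route(posts):
    """Split (text, score) pairs into published posts and a moderation queue."""
    published, moderation_queue = [], []
    for text, score in posts:
        if classify_two_class(score) == "ham":
            published.append(text)
        else:
            # Everything labelled spam waits for a human moderator.
            moderation_queue.append(text)
    return published, moderation_queue


# Made-up sample data: two posts sit near the boundary on the wrong side.
posts = [
    ("hello", 0.9),
    ("buy pills", 0.1),
    ("legit but odd", 0.4),
    ("sneaky spam", 0.6),
]
published, moderation_queue = route(posts)
```

With this made-up data, "sneaky spam" gets published and "legit but odd" ends up buried in the queue, which are the two failure modes described above.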

Here is how Mollom solves these problems. Instead of two classes, we define three: 'spam', 'unsure' and 'ham'. Mollom returns 'spam' only if it is 100% sure that the post is spam and these posts are discarded. If Mollom is quite certain (more certain than using the old technique) that a post is ham, it is accepted. But what about the rest?

We define a gray zone, an area of uncertainty, and here is where the CAPTCHAs come in. When Mollom is unsure about a submission, the user is asked to respond to a CAPTCHA. If the response is correct, and thus the submitter is human, the content will be accepted. Otherwise the post will be rejected. But wait, people hate CAPTCHAs ... True, but as you can see on the graph above, only a tiny fraction of real human-submitted content falls into our 'unsure' zone and triggers a CAPTCHA (currently, only approximately 4% of human submissions). For the most part, CAPTCHAs are not shown to humans at all; they are shown to the bots!
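
The three-class flow described above could be sketched roughly as follows in Python. This is not Mollom's implementation: the 0..1 score axis (larger values are more ham-like, matching the graph's orientation), both threshold values and all names are hypothetical and chosen only for illustration.

```python
from enum import Enum


class Verdict(Enum):
    SPAM = "spam"
    UNSURE = "unsure"
    HAM = "ham"


# Hypothetical thresholds; the real values used by Mollom are not given
# in the text. Scores between the two fall into the gray zone.
SPAM_BELOW = 0.2
HAM_ABOVE = 0.8


def classify(score):
    """Map a score to one of three classes instead of two."""
    if score < SPAM_BELOW:
        return Verdict.SPAM
    if score > HAM_ABOVE:
        return Verdict.HAM
    return Verdict.UNSURE


def handle_submission(score, captcha_passed):
    """Decide what happens to a post; the CAPTCHA only matters when unsure."""
    verdict = classify(score)
    if verdict is Verdict.SPAM:
        return "discarded"
    if verdict is Verdict.HAM:
        return "accepted"
    # Gray zone: the submitter must prove they are human.
    return "accepted" if captcha_passed else "rejected"
```

In this sketch the CAPTCHA result is ignored unless the score falls in the gray zone, so nothing ever needs to be parked in a moderation queue.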

So, to sum up: (i) Mollom is more accurate because our ham boundary is shifted to the right on our graph (making it very strict), so significantly less spam can sneak in (we are now at 99.94% correctly classified ham messages), and (ii) the need for a moderation queue is gone, since the real human users perform the moderation themselves instead of site owners or moderators.

Submitted by : Bajar Libros

Mollom is working great so far

I just added Mollom to one of my Drupal sites and so far I'm impressed. I've used Akismet and BadBehavior, which together prevent a good deal of spam, but not all. After using Mollom for just over a week the site has remained 'pure ham' ...
Now if only we could stop the kiddies from posting stupid stuff, life would be perfect.