[linux-audio-user] Fwd: [Fwd: Wiki Spam Report]

Ruth A. Kramer rhkramer at fast.net
Thu Dec 9 22:09:19 EST 2004


Hans Fugal wrote:
> There have been some differing opinions on whether a wiki will attract
> spam and what to do about it. Here's a message about what the
> RubyGarden wiki has experienced and done. Some of you may be familiar
> with Ruby, and know that it is an extremely cool language but not
> (yet) as popular as other languages like Perl, Python, Java, etc. If
> you haven't heard of it, well that just attests to its not being a
> major player in the language market (yet). Yet they struggle with wiki
> spam.

Hans,

Thanks!  Very interesting approach!

I'd like to find out how much time Jim spends dealing with the tarpit. 
I may write to him someday, unless you find it convenient to do so.

regards,
Randy Kramer

PS: Unless his efforts take zero time, I'd rather wait till a spam
problem exists on WikiLearn before implementing such an approach.  In
the meantime, WikiLearn has, for example, the registration requirement.

> 
> ---------- Forwarded message ----------
> From: "Jim Weirich" <jim at weirichhouse.org>
> To:  comp.lang.ruby
> Date: Tue, 14 Dec 2004 03:21:02 +0900
> Subject: Wiki Spam Report
> Wiki Spam Report
> ----------------
> 
> I thought I would take some time and report on the wiki spam situation
> on RubyGarden.  As I hope you have noticed, the wiki has been
> remarkably spam free.  This email will tell you what measures we have
> taken to get to this point.
> 
> But first ...
> 
> Some Numbers
> ------------
> 
> Over the past 10 days, we have had:
> 
>   93 updates to the wiki page, all (AFAICT) spam free.
>      (although I might have missed spotting some).
> 
>   46 updates to the wiki tarpit.  Of those, we had ...
>      3 innocent updates
>      2 questionable updates
>      1 update by me
>     40 spams
> 
> The Mechanism
> -------------
> 
> Spammers are automatically routed to a wiki tarpit.  The tarpit is an
> (almost) exact copy of the real RubyGarden wiki.  Making changes to
> the tarpit looks as if you are making changes to the real wiki.  And
> since spammers get their pages from the wiki, it looks (to them) as if
> they have successfully spammed our site.
> 
> However, no one else ever gets to see the spam.
> 
> By tricking the spammers into thinking they are successful, they don't
> put any additional effort into bypassing our spam detection criteria.
> This is important!  When we explicitly denied them access to the wiki,
> they went to great lengths to figure out how to get around the
> restrictions.  I haven't seen any of that kind of probing with the
> tarpit.
> 
> Detecting Spammers
> ------------------
> 
> The current spammer detection logic is based on two observations:
> 
> (1) Spammers almost never use an IP address that has reverse lookup
> enabled.  This effectively means that it appears (to the wiki
> software) that your host name looks like a numeric IP address.
> 
> (2) Spammers almost never set user preferences on the wiki.
> 
> So if both of these conditions are true, we treat the access as coming
> from a spammer and send it to the tarpit.
> 
> Now this isn't perfect, but that's OK.  We also have an explicit ban
> list for spammers who pass one of (1) or (2) above.  And we have an
> explicit allow list that overrides the automatic spammer detection.
> 
> Innocent Users
> --------------
> 
> Can innocent users get trapped by the tarpit?  The short answer is yes.
> However, we are monitoring the tarpit and will attempt to rescue such
> users.
> 
> In the past 10 days, there were at least 3 page updates that were from
> innocent users.  One guy (bless his heart) even removed some spam from
> the tarpit for us.
> 
> When I see innocents trapped in the tarpit, I add their IP address to
> the allow list and manually update the wiki with their changes (if
> they are significant).
> 
> Detecting the Tarpit?
> ---------------------
> 
> The tarpit is deliberately designed to look like the original wiki, so
> it is sometimes difficult to tell when you are trapped.  Here are some
> suggestions.
> 
> You are probably in the tarpit when:
> 
> * there are a lot of recent updates made with numeric IP addresses
>   rather than host names.
> 
> * a lot of the pages have spam.
> 
> Neither of these suggestions is foolproof, though.  I refresh the tarpit
> from the real wiki occasionally (to keep it looking realistic).
> Immediately after a refresh it is /very/ difficult to tell the difference.
> 
> If you think you are trapped by the tarpit, send me
> (jim at weirichhouse.org) an email with your IP address and I will check
> the logs.  If you are trapped, we can add your IP address to the allow
> list.
> 
> If you are worried about getting caught in the tarpit, just make sure you
> have your user preferences set when accessing the wiki (click on the
> preferences link from any wiki page).
> 
> Summary
> -------
> 
> I am pretty happy with the current wiki situation.  In fact, the
> tarpit has been so successful that I am considering lifting the ban
> on lower case http.  The ban currently isn't buying us any benefits
> and is rather annoying (I'll make it so both upper and lower case
> work).
> 
> Thanks for your time.
> 
> --
> -- Jim Weirich     jim at weirichhouse.org    http://onestepback.org
> -----------------------------------------------------------------
> "Beware of bugs in the above code; I have only proved it correct,
> not tried it." -- Donald Knuth (in a memo to Peter van Emde Boas)
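
[Editor's sketch, not part of Jim's message.] The detection rules in the
forwarded report (numeric host name, no user preferences, plus explicit
ban and allow lists) could be expressed roughly as below. This is only
an illustration in Python, not the RubyGarden code: the names
SpamPolicy, looks_numeric, is_suspect and target are hypothetical, and
checking the allow list before the ban list is an assumption, since the
report only says the allow list overrides the automatic detection.

```python
import ipaddress
from dataclasses import dataclass, field


def looks_numeric(hostname: str) -> bool:
    """Return True when the 'host name' is just an IP address,
    i.e. reverse lookup did not yield a name (observation 1)."""
    try:
        ipaddress.ip_address(hostname)
        return True
    except ValueError:
        return False


@dataclass
class SpamPolicy:
    # Hypothetical containers for the manually maintained lists.
    allow_list: set = field(default_factory=set)  # IPs rescued by hand
    ban_list: set = field(default_factory=set)    # IPs banned by hand

    def is_suspect(self, ip: str, hostname: str, has_preferences: bool) -> bool:
        # Assumption: the allow list wins over everything else.
        if ip in self.allow_list:
            return False
        # The ban list catches spammers who pass one of the two checks.
        if ip in self.ban_list:
            return True
        # Automatic rule: both observations must hold.
        return looks_numeric(hostname) and not has_preferences

    def target(self, ip: str, hostname: str, has_preferences: bool) -> str:
        """Name the copy of the wiki that should serve this request."""
        return "tarpit" if self.is_suspect(ip, hostname, has_preferences) else "wiki"
```

With this sketch, a visitor whose host name is numeric and who has no
preferences set is routed to "tarpit"; setting preferences, or being on
the allow list, routes the same visitor to "wiki".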


