From: Jamie Thingelstad jamie@thingelstad.com
To: Al Johnson alj62888@yahoo.com; MediaWiki announcements and site admin list mediawiki-l@lists.wikimedia.org
Sent: Tuesday, May 28, 2013 10:14 AM
Subject: Re: [MediaWiki-l] Wiki spam. Stronger fightback.
I can — Nemo was referencing some dialog we had while I was trying to figure out performance issues on my farm.
You can see the whole dialog, with graphs, here:
http://wikiapiary.com/wiki/WikiApiary_talk:Operations/2013/May#Banned_IP_che...
Actually, after rereading that Talk page I'll just leave it at the link. All the gruesome details are there. :-)
TL;DR: Using this stopforumspam method caused a 4-5x slowdown in performance on my farm.
I see. I'm curious how the lookup was implemented. It sounds like an in-memory search. Have you thought about using a Bloom filter[1]? That should reduce your lookup time to sub-millisecond. I recently implemented one in both JavaScript and Java and am quite happy with it. It holds up to 1 million entries in 2.7 MB with a false positive rate of under 0.1%.
[1] http://en.wikipedia.org/wiki/Bloom_filter
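For readers unfamiliar with the idea, here is a minimal Python sketch of a Bloom filter used as a membership test for banned IP addresses. It is not the JavaScript or Java implementation mentioned above. The names `BloomFilter` and `optimal_parameters`, the SHA-256 double-hashing scheme, and the sample address are all assumptions made for illustration. The sizing uses the standard formulas m = -n ln(p) / (ln 2)^2 and k = (m / n) ln 2; for 1 million entries at a 0.1% false positive rate they give roughly 14.4 million bits (about 1.8 MB) and 10 hash functions, so the 2.7 MB figure quoted above leaves some headroom.

```python
import hashlib
import math


def optimal_parameters(capacity, false_positive_rate):
    """Return (bit_count, hash_count) from the standard Bloom filter sizing formulas."""
    bit_count = math.ceil(-capacity * math.log(false_positive_rate) / (math.log(2) ** 2))
    hash_count = max(1, round(bit_count / capacity * math.log(2)))
    return bit_count, hash_count


class BloomFilter:
    """Illustrative Bloom filter; the hashing scheme is an assumption, not the author's."""

    def __init__(self, capacity, false_positive_rate):
        self.bit_count, self.hash_count = optimal_parameters(capacity, false_positive_rate)
        self.array = bytearray((self.bit_count + 7) // 8)

    def _positions(self, item):
        # Double hashing: derive all bit positions from two 64-bit slices of one digest.
        digest = hashlib.sha256(item.encode("utf-8")).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1  # force odd so the step is never zero
        for i in range(self.hash_count):
            yield (h1 + i * h2) % self.bit_count

    def add(self, item):
        for pos in self._positions(item):
            self.array[pos >> 3] |= 1 << (pos & 7)

    def __contains__(self, item):
        # True means "possibly present"; False means "definitely absent".
        return all(self.array[pos >> 3] & (1 << (pos & 7)) for pos in self._positions(item))


banned = BloomFilter(capacity=1_000_000, false_positive_rate=0.001)
banned.add("203.0.113.7")  # hypothetical banned address from a documentation range
print("203.0.113.7" in banned)  # True: added items are never reported as absent
```

A lookup costs a single hash and a handful of bit tests no matter how large the list is. A positive answer can be wrong at roughly the configured rate, so a hit would normally be confirmed against the real blacklist before blocking anyone.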
al
Jamie Thingelstad jamie@thingelstad.com
On May 28, 2013, at 10:48 AM, Al Johnson alj62888@yahoo.com wrote:
From: Federico Leva (Nemo) nemowiki@gmail.com
To: mediawiki-l@lists.wikimedia.org
Sent: Tuesday, May 28, 2013 2:47 AM
Subject: Re: [MediaWiki-l] Wiki spam. Stronger fightback.
Al, the problem with Stop Forum Spam is not memory but rather the CPU usage and lag it creates, for an apparently limited amount of spam blocked (at least on small wikis). See https://www.mediawiki.org/wiki/Manual_talk:Combating_spam#CPU_usage.3B_IP_blacklists
Hi Nemo. You didn't say why you would not recommend that approach. Can you elaborate?
Nemo
MediaWiki-l mailing list MediaWiki-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/mediawiki-l