Re: [Wikitech-l] AntiSpoof issues

12 Nov 2006


On 11/12/06, Tim Starling tstarling@wikimedia.org wrote:
[snip]
...
The problem is that merging sets is fairly fundamental to the way AntiSpoof
works -- i.e. by calculating a canonical representation of the username,
storing it and indexing it.
[snip]
Two pass:
Use the current high-compression function to locate candidate matches
quickly from a non-unique index.
Then take the actual names of the potential matches and compare them
directly using a more intelligent comparison (e.g. 'n' != 'H').
The compression function could be made more lossy, so that it
identifies a large but not unreasonable number of potentials.
We could even assign points to various kinds of matches and deny past
a threshold. This would also make it easier to support bi/trigram
triggers such as cI ~= d, which perhaps get more interesting when
we consider the entire UTF-8 charset.
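
To make the two-pass idea concrete, here is a rough Python sketch. It is not AntiSpoof's actual code: the confusable tables, the point values, the threshold and every name in it (FOLD, BIGRAMS, SpoofIndex, ...) are made up for illustration. Pass one looks candidates up under a lossy key in a non-unique index; pass two compares the real names and adds up "difference" points, denying the new name when the total stays under a threshold.

```python
from collections import defaultdict

# Hypothetical confusable tables, invented for this sketch only.
FOLD = {ch: rep
        for group, rep in (("Il1", "l"), ("O0", "o"), ("nhH", "n"))
        for ch in group}
BIGRAMS = {"cl": "d"}            # folded bigram -> coarse symbol (cI ~= d)
WEAK = {frozenset("nh")}         # merged by the lossy key, but easy to tell apart
POINTS = {"lookalike": 1, "bigram": 3, "weak": 10}   # made-up weights


def fold(ch):
    """Map one character to its coarse class (lower-case by default)."""
    return FOLD.get(ch, ch.lower())


def tokenize(name):
    """Split a name into (coarse symbol, original chunk) pairs."""
    tokens = []
    i = 0
    while i < len(name):
        pair = "".join(fold(c) for c in name[i:i + 2])
        if pair in BIGRAMS:
            tokens.append((BIGRAMS[pair], name[i:i + 2]))
            i += 2
        else:
            tokens.append((fold(name[i]), name[i]))
            i += 1
    return tokens


def coarse_key(name):
    """Pass 1: deliberately lossy key, used for the non-unique index."""
    return "".join(symbol for symbol, _ in tokenize(name))


def difference(a, b):
    """Pass 2: difference points for two names that share a coarse key."""
    total = 0
    for (_, x), (_, y) in zip(tokenize(a), tokenize(b)):
        if x.lower() == y.lower():
            continue                          # identical or case-only
        if len(x) != len(y):
            total += POINTS["bigram"]         # e.g. "cI" against "d"
        elif frozenset((x.lower(), y.lower())) in WEAK:
            total += POINTS["weak"]           # e.g. "n" against "H"
        else:
            total += POINTS["lookalike"]      # e.g. "I" against "l"
    return total


class SpoofIndex:
    def __init__(self, threshold=10):
        self.by_key = defaultdict(list)       # coarse key -> existing names
        self.threshold = threshold

    def add(self, name):
        self.by_key[coarse_key(name)].append(name)

    def conflicts(self, name):
        """Existing names that are too close to the requested one."""
        candidates = self.by_key.get(coarse_key(name), [])
        return [c for c in candidates
                if difference(name, c) < self.threshold]


index = SpoofIndex()
index.add("Hello")
index.add("dave")
print(index.conflicts("nello"))   # [] : same coarse key, but 'n' != 'H'
print(index.conflicts("HeIlo"))   # ['Hello'] : 'I' for 'l'
print(index.conflicts("cIave"))   # ['dave'] : bigram 'cI' for 'd'
```

The point of the split is that the index key can be as lossy as we like, since every false positive from pass one gets a second look before anything is denied.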
