[Wikitech-l] Re: [Wikide-l] Löschung

19 Jul 2003


      Earlier, I wrote:
...
or simply [[Heilgen Römischen Reiches Deutscher Nation]] and a
bit of "fuzzy matching" against the list of existing articles?
[...]
In German, these different word endings are never (?) longer than the
last two characters of a word, which made me suggest the following
algorithm, from which I think the Swedish and Danish Wikipedia could
also benefit:

My suggested algorithm is far too simple and leads to too many false
positives.  I've tried it (on susning.nu) and disabled it again.
A working algorithm probably must recognize typical endings, which
makes it language specific (-er, -es in German; -a, -t in Swedish).
Existing "stemming" algorithms might be worth trying. I'm not going
any deeper into this.
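
For illustration only, here is a rough Python sketch of the difference
between blind truncation and stripping known endings. The algorithm from
the earlier message is not reproduced above, so naive_key() is merely a
stand-in for a rule of that kind, not the rule itself. The function names,
the sample titles and every ending other than -er, -es, -a and -t are
assumptions made for this example.

```python
# Hypothetical sketch: match a link target against existing article titles
# by normalizing each word before comparing.

# Assumed and incomplete ending lists. Only -er/-es (German) and -a/-t
# (Swedish) are mentioned in the message; the rest are guesses.
ENDINGS = {
    "de": ("er", "es", "en", "em", "e", "n", "s"),
    "sv": ("a", "t"),
}


def naive_key(title):
    """Drop the last two characters of every word longer than three."""
    return " ".join(w[:-2] if len(w) > 3 else w for w in title.lower().split())


def ending_key(title, lang):
    """Strip at most one known ending per word, keeping a stem of >= 3 chars."""
    stems = []
    for word in title.lower().split():
        # Try longer endings first so "-es" wins over "-s".
        for ending in sorted(ENDINGS[lang], key=len, reverse=True):
            if word.endswith(ending) and len(word) - len(ending) >= 3:
                word = word[: -len(ending)]
                break
        stems.append(word)
    return " ".join(stems)


def find_matches(link, titles, key):
    """Return the existing titles whose normalized form equals the link's."""
    wanted = key(link)
    return [t for t in titles if key(t) == wanted]


if __name__ == "__main__":
    titles = ["Heiliges Römisches Reich Deutscher Nation", "Hammel"]

    def german(t):
        return ending_key(t, "de")

    print(find_matches("Heiligen Römischen Reiches Deutscher Nation", titles, german))
    print(find_matches("Hammer", titles, naive_key))
    print(find_matches("Hammer", titles, german))
```

With these sample titles, blind truncation maps both "Hammer" and "Hammel"
to "hamm", which is the kind of false positive described above, while the
ending list leaves "Hammel" untouched. A real stemmer would need far more
rules than this.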
-- 
  Lars Aronsson (lars@aronsson.se)
  Aronsson Datateknik - http://aronsson.se/
