I have been compiling a machine-compiled lexicon built from the link and disambiguation pages in the XML dumps. Oddly, the associations contained in [[ARTICLE_NAME | NAME]] piped links form a comprehensive "real time" thesaurus of the common associations used by current English speakers on Wikipedia, and may well comprise the world's largest and most comprehensive thesaurus, embedded in the mesh of links within the dumps.
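Roughly, the harvesting step looks like the sketch below (Python run against a pages-articles dump; the dump filename and the export-schema namespace URI are placeholders, and the piped-link regex is deliberately naive, ignoring templates, nested links, and namespace prefixes such as File: or Category:):

    import bz2
    import re
    import xml.etree.ElementTree as ET
    from collections import defaultdict

    PIPED_LINK = re.compile(r"\[\[([^\[\]|{}]+)\|([^\[\]|{}]+)\]\]")
    NS = "{http://www.mediawiki.org/xml/export-0.10/}"  # schema version varies by dump

    def harvest(dump_path):
        """Yield (label, target) pairs from every page text in the dump."""
        with bz2.open(dump_path, "rb") as f:
            for _event, elem in ET.iterparse(f):
                if elem.tag == NS + "text" and elem.text:
                    for target, label in PIPED_LINK.findall(elem.text):
                        yield label.strip(), target.strip()
                elif elem.tag == NS + "page":
                    elem.clear()  # keep memory bounded on multi-gigabyte dumps

    if __name__ == "__main__":
        thesaurus = defaultdict(set)  # surface label -> set of article targets
        for label, target in harvest("enwiki-pages-articles.xml.bz2"):
            thesaurus[label].add(target)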
While going through the dumps and constructing associative link maps of all these expressions, I have noticed a serious issue with embedded links on proper names. It appears there may be a bot running somewhere that takes proper names listed in articles about relationships between people and blindly links them to whatever Wikipedia entry matches the name.
Some of the content could create controversy if I posted examples here, so I will finish the thesaurus compilation first, and folks should then go through the encyclopedia. Articles about movie stars and other "gossipy" subjects seem to have the highest rate of errors linking proper names to unrelated people without proper disambiguation pages. Some of these erroneous links could be interpreted as violations of WP:BLP, and they could be troublesome for the Foundation.
Whoever is running bots that link between articles should check proper-name links against categories and look into this. I found a large number of these errors. They are subtle, but they will most likely surface as people browse through articles unless you can analyze the link targets and relationships in the dumps.
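As a rough illustration of that category check, something along these lines could flag piped links whose label looks like a personal name but whose source article and link target share no categories at all. The endpoint is the standard English Wikipedia API; the name-matching pattern and the zero-overlap rule are only assumptions, a crude heuristic rather than a real BLP test:

    import re
    import requests

    API = "https://en.wikipedia.org/w/api.php"
    NAME_LIKE = re.compile(r"^[A-Z][a-z]+(?: [A-Z][a-z]+)+$")  # e.g. "John Smith"

    def categories(title):
        """Return the set of category titles for a page (empty if missing)."""
        params = {
            "action": "query",
            "prop": "categories",
            "titles": title,
            "cllimit": "max",
            "format": "json",
        }
        pages = requests.get(API, params=params, timeout=30).json()["query"]["pages"]
        page = next(iter(pages.values()))
        return {c["title"] for c in page.get("categories", [])}

    def flag_link(source_title, target_title, label):
        """True if a name-like piped link deserves a human look."""
        if not NAME_LIKE.match(label):
            return False
        return not (categories(source_title) & categories(target_title))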
Jeff
On 4/1/07, Jeffrey V. Merkey jmerkey@wolfmountaingroup.com wrote:
I have been compiling a machine-compiled lexicon built from the link and disambiguation pages in the XML dumps. [...]
Hey Jeff, would you mind forwarding me a copy of your extracted data? A long time back I extracted the same data using an instrumented copy of the MediaWiki parser, for the purpose of creating missing redirect pages. I didn't save my work, and getting the data from you would save me from reinventing the wheel all over again.
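For what it's worth, a sketch of that redirect-mining step might look like the following: treat a surface label as a redirect candidate when it overwhelmingly points at one target and no page already exists at that exact title. Here link_counts and existing_titles would come from the dump, and the 0.9 dominance ratio and five-use minimum are arbitrary assumptions:

    from collections import Counter

    def suggest_redirects(link_counts, existing_titles, dominance=0.9, min_uses=5):
        """Yield (label, target) pairs that look like safe redirect candidates."""
        for label, targets in link_counts.items():
            if label in existing_titles:
                continue  # a page (or redirect) already sits at this title
            total = sum(targets.values())
            target, hits = Counter(targets).most_common(1)[0]
            if total >= min_uses and hits / total >= dominance:
                yield label, target

    # Toy data only, to show the shape of the inputs and output.
    counts = {"Big Apple": {"New York City": 42, "Apple Inc.": 1}}
    print(list(suggest_redirects(counts, existing_titles={"New York City"})))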
Thanks.
Gregory Maxwell wrote:
Would you mind forwarding me a copy of your extracted data? [...]
I'll post it today at ftp://ftp.wikigadugi.org. It's very useful.
Jeff