[WikiEN-l] Serious problems with interlanguage links

Lukasz Bolikowski L.Bolikowski at icm.edu.pl
Fri Dec 12 14:03:33 UTC 2008


Gregory Maxwell wrote:
> It would be interesting to re-run the analysis including only linkages
> among the largest few Wikipedias and resolve those first: Those
> really should be much closer to the ideal "x==y" behavior.

Hi, I've rerun the analysis as you proposed.  I've taken all the
articles from the 10 largest editions (de, en, es, fr, it, ja, nl, pl,
pt, ru) and the interlanguage links between them.  I was looking for
incoherent components, as defined in my previous posts.
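
For readers who have not seen the earlier posts, here is a minimal sketch
of this kind of check.  It is not the code used for the analysis.  It
assumes that a component is "incoherent" when it contains two or more
articles from the same edition (which is what the PS example illustrates),
and that links are treated as undirected.  The function name and the toy
links are made up for illustration and are not taken from any dump.

```python
from collections import defaultdict


def incoherent_components(links):
    """Return components that hold 2+ articles from one edition.

    `links` is an iterable of ((lang, title), (lang, title)) pairs.
    Assumption: link direction is ignored (union-find over both ends).
    """
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        root = x
        while parent[root] != root:
            root = parent[root]
        # Path compression: point every node on the path at the root.
        while parent[x] != root:
            parent[x], x = root, parent[x]
        return root

    for a, b in links:
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[ra] = rb

    # Group all articles by the root of their component.
    components = defaultdict(set)
    for node in list(parent):
        components[find(node)].add(node)

    result = []
    for members in components.values():
        langs = [lang for lang, _ in members]
        if len(langs) != len(set(langs)):  # some edition appears twice
            result.append(members)
    return result


# Toy data (hypothetical links, for illustration only).
toy_links = [
    (("en", "Company"), ("de", "Unternehmen")),
    (("de", "Unternehmen"), ("en", "Business")),
    (("en", "Guild"), ("pl", "Cech")),
]
bad = incoherent_components(toy_links)
```

With the toy data, only the component holding both "Company" and
"Business" from the English edition is reported; the Guild/Cech pair has
one article per edition and is left out.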

In this setting, there are 44245 incoherent components, containing, in
total, 436529 articles from the 10 editions.  This shows that the
problem is not limited to links to or from small wikis.  Also, if you
try the engine at:
  http://wikitools.icm.edu.pl/
you'll see that the differences between the ontologies of the large and
the small wikis are not the only issue (as Eugene suggested).

Let me rephrase my concerns: on one hand, the policies state that
interlanguage links represent equivalence: Meta says they connect
"corresponding" articles, the English edition says they connect articles
"on the same subject".  There are third-party projects which assume this
(I've given two examples before), not to mention an army of bots.

On the other hand, editors don't respect that strict interpretation
since they want to show (valuable) relations between non-equivalent
articles.  Without seeing the "big picture", any such inexact link seems
OK: what could possibly go wrong?  And the global view does not seem to
be commonly known...

I don't have a ready solution, although, as I've written before, we
could take a closer look at the way OmegaWiki is dealing with the issue
(in my perception, the project's existence is motivated solely by the
existence of the issue in question), and the potential offered by the
SemanticMediaWiki extension.

My main goal is to convince the community (or be convinced otherwise)
that this is a serious, growing problem which requires attention,
and to stimulate a discussion which might lead to a reasonable solution.

Regards,
Łukasz




PS. An example: the following English articles are mutually accessible
using only the interlanguage links between the top 10 editions (assuming
that a link A -> B also makes A accessible from B, i.e. that links are
treated as undirected, which doesn't necessarily match users'
experience, but bots and harvesters "see" it):
 Administration
 Administration (business)
 Administrator
 Aktiebolag
 Aktiengesellschaft
 Aktieselskab
 Apostolic Administrator
 Besloten Vennootschap
 Brother (disambiguation)
 Brotherhood
 Brotherhood (album)
 Business
 Businessperson
 Compagnons du Tour de France
 Companies law
 Company
 Company (disambiguation)
 Contract
 Corporate law
 Corporation
 Corporation (university)
 Entrepreneur
 Entrepreneurship
 Fraternities and sororities
 Fraternity
 Fraternity (disambiguation)
 General partnership
 German Student Corps
 Gesellschaft mit beschränkter Haftung
 Government-owned corporation
 Guild
 Hermano
 Hermano (band)
 Incorporation (business)
 Joint stock company
 Journeyman
 Junior Chamber International
 Kabushiki kaisha
 Legal name (business)
 Limited company
 Limited liability company
 List of general fraternities
 Management
 Management science
 Maszoperia
 Naamloze Vennootschap
 Public company
 Public limited company
 S.A. (corporation)
 Sibling
 Sister (disambiguation)
 Société à responsabilité limitée
 Société par actions simplifiée
 Society
 Society (disambiguation)
 Sole proprietorship
 Studentenverbindung
 Student society
 Trade name
 Types of business entity
 Yugen kaisha
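
A list like the one above can be obtained, roughly, by walking the links
outward from a single article.  The sketch below assumes links are
treated as undirected, as stated in the PS; the function name and the toy
links are hypothetical, and this is not the code behind the engine.

```python
from collections import defaultdict, deque


def same_edition_reachable(links, start):
    """List titles in start's edition reachable from `start`.

    `links` is an iterable of ((lang, title), (lang, title)) pairs.
    Assumption: a link A -> B is followed in both directions.
    """
    neighbours = defaultdict(set)
    for a, b in links:
        neighbours[a].add(b)
        neighbours[b].add(a)  # treat the link as undirected

    # Plain breadth-first search over the undirected link graph.
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in neighbours[node]:
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)

    lang = start[0]
    return sorted(title for l, title in seen if l == lang)


# Toy data (hypothetical links, for illustration only).
toy_links = [
    (("en", "Company"), ("de", "Unternehmen")),
    (("de", "Unternehmen"), ("en", "Business")),
    (("en", "Guild"), ("pl", "Cech")),
]
titles = same_edition_reachable(toy_links, ("en", "Company"))
```

The result includes the starting article itself; a result with more than
one title means the starting article sits in an incoherent component.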


