[WikiEN-l] Serious problems with interlanguage links
Lukasz Bolikowski
L.Bolikowski at icm.edu.pl
Fri Dec 12 14:03:33 UTC 2008
Gregory Maxwell wrote:
> It would be interesting to re-run the analysis including only linkages
> among the largest few Wikipedias and resolve those first: those
> really should be much closer to the ideal "x==y" behavior.
Hi, I've rerun the analysis as you proposed. I've taken all the
articles from the 10 largest editions (de, en, es, fr, it, ja, nl, pl,
pt, ru) and the interlanguage links between them. I was looking for
incoherent components, as defined in my previous posts.
In this setting, there are 44245 incoherent components containing, in
total, 436529 articles from the 10 editions. This shows that the problem
is not limited to links to and from small wikis. Also, if you try the
engine at:
http://wikitools.icm.edu.pl/
you'll see that differences between the ontologies of the large and
the small wikis are not the only issue (as Eugene suggested).
Let me rephrase my concerns: on one hand, the policies state that
interlanguage links represent equivalence: Meta says they connect
"corresponding" articles, the English edition says they connect articles
"on the same subject". There are third-party projects which assume this
(I've given two examples before), not to mention an army of bots.
On the other hand, editors don't respect that strict interpretation
since they want to show (valuable) relations between non-equivalent
articles. Without seeing the "big picture", any such inexact link seems
OK: what could possibly go wrong? And the global view does not seem to
be commonly known...
I don't have a ready solution, although, as I've written before, we
could take a closer look at how OmegaWiki deals with the issue
(as I see it, the project exists solely because of this issue),
and at the potential offered by the SemanticMediaWiki extension.
My main goal is to convince the community (or be convinced otherwise)
that this is a serious, growing problem, which requires attention,
and stimulate a discussion which might lead to a reasonable solution.
Regards,
Łukasz
PS. An example: the following English articles are mutually accessible
using only the interlanguage links between the top 10 editions (assuming
that a link A -> B also makes A accessible from B, i.e. treating links as
undirected; this doesn't necessarily match users' experience, but bots
and harvesters "see" it):
Administration
Administration (business)
Administrator
Aktiebolag
Aktiengesellschaft
Aktieselskab
Apostolic Administrator
Besloten Vennootschap
Brother (disambiguation)
Brotherhood
Brotherhood (album)
Business
Businessperson
Compagnons du Tour de France
Companies law
Company
Company (disambiguation)
Contract
Corporate law
Corporation
Corporation (university)
Entrepreneur
Entrepreneurship
Fraternities and sororities
Fraternity
Fraternity (disambiguation)
General partnership
German Student Corps
Gesellschaft mit beschränkter Haftung
Government-owned corporation
Guild
Hermano
Hermano (band)
Incorporation (business)
Joint stock company
Journeyman
Junior Chamber International
Kabushiki kaisha
Legal name (business)
Limited company
Limited liability company
List of general fraternities
Management
Management science
Maszoperia
Naamloze Vennootschap
Public company
Public limited company
S.A. (corporation)
Sibling
Sister (disambiguation)
Société à responsabilité limitée
Société par actions simplifiée
Society
Society (disambiguation)
Sole proprietorship
Studentenverbindung
Student society
Trade name
Types of business entity
Yugen kaisha
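
For anyone who wants to reproduce this kind of check, here is a minimal
sketch of how such components could be found. It is not the code behind
the numbers above. It assumes that links are treated as undirected (as in
the PS), and that a component counts as "incoherent" when it contains
more than one article from the same edition; that definition is my
shorthand here and is not restated in this post. The function names and
the toy link data are made up for illustration only.

```python
from collections import defaultdict


def find(parent, node):
    """Return the representative of node's set (union-find, path halving)."""
    while parent[node] != node:
        parent[node] = parent[parent[node]]
        node = parent[node]
    return node


def components(links):
    """Group articles into connected components, ignoring link direction.

    links is an iterable of (source, target) pairs, where each article
    is a (language, title) tuple.
    """
    parent = {}
    for a, b in links:
        parent.setdefault(a, a)
        parent.setdefault(b, b)
        root_a, root_b = find(parent, a), find(parent, b)
        if root_a != root_b:
            parent[root_a] = root_b
    groups = defaultdict(set)
    for node in parent:
        groups[find(parent, node)].add(node)
    return list(groups.values())


def is_incoherent(component):
    """Assumed definition: some edition contributes two or more articles."""
    seen = set()
    for lang, _title in component:
        if lang in seen:
            return True
        seen.add(lang)
    return False


# Made-up toy links, NOT taken from any real dump.
links = [
    (("en", "Title-A"), ("de", "Titel-1")),
    (("de", "Titel-1"), ("en", "Title-B")),  # pulls a second en article in
    (("en", "Title-C"), ("pl", "Tytul-1")),  # a coherent pair
]

all_components = components(links)
incoherent = [c for c in all_components if is_incoherent(c)]
print(len(all_components), "components,", len(incoherent), "incoherent")
```

On a real dump the same idea applies, with the link pairs read from the
langlinks tables of the editions under study.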