[Wikimedia-l] Language links and double language links on the Wikipedias

Denny Vrandečić denny.vrandecic at wikimedia.de
Tue Jun 26 12:56:06 UTC 2012


I got the number from Brent Hecht, a researcher at Northwestern, who
has a number of great papers published on Wikipedia-related topics.

CC-ing him, so he knows I am blam.., er, referencing him :)

Cheers,
Denny



2012/6/26 Martijn Hoekstra <martijnhoekstra at gmail.com>:
> This number, 99.2%, was also mentioned at the Berlin Hackathon. It
> sounds much higher than what my (very scientifically relevant,
> obviously) gut feeling tells me. Could you indicate where this number
> is coming from?
>
> On Tue, Jun 26, 2012 at 2:45 PM, Denny Vrandečić
> <denny.vrandecic at wikimedia.de> wrote:
>> Ziko,
>>
>> it does not jeopardize the Wikidata goal -- the current language link
>> system won't be switched off and can continue to be used. Everything
>> that is working currently will still be possible afterwards. Wikidata can
>> still be used to represent the 99.2% of language links that are simple
>> -- this would still be a huge improvement over the current state.
>>
>> As soon as these are out of the way, we can think about if and how to
>> extend the system in order to deal with the rest.
>>
>> Cheers,
>> Denny
>>
>> 2012/6/25 Ziko van Dijk <vandijk at wmnederland.nl>:
>>> Hello,
>>>
>>> So may I guess that "double links" are usually the result of a
>>> Wikipedian who was not sure which language link to set, so in doubt,
>>> he simply put in the language links for two different articles?
>>>
>>> And in general, is it imaginable that different languages divide the
>>> knowledge in different ways, which could jeopardize the whole goal of
>>> Wikidata unifying the language links?
>>>
>>> Kind regards
>>> Ziko
>>>
>>>
>>> 2012/6/25 Delirium <delirium at hackish.org>:
>>>> Thanks for this list. For the languages I know, I've started going through
>>>> and fixing ones that are clearly wrong. If a number of people do that, that
>>>> should improve the general quality/consistency of interwiki links. I second
>>>> the other comment that it'd be nice if the parsing could be re-run to
>>>> exclude commented-out links, but the list is still useful as is.
>>>>
>>>> There are some difficult cases, though, when languages make different
>>>> choices on how to group subjects, so the articles aren't actually in 1-to-1
>>>> correspondence. For example, the English article [[en:Móði and Magni]]
>>>> unsurprisingly has two outgoing interwiki links, when linking to languages
>>>> that split them, such as [[da:Magni]] and [[da:Modi]]. It's not clear what
>>>> to do about these cases.
>>>>
>>>> Best,
>>>> Mark
>>>>
>>>>
>>>> On 6/25/12 12:29 PM, Denny Vrandečić wrote:
>>>>>
>>>>> Hi all,
>>>>>
>>>>> I ran some analysis last week, to get some numbers out of the
>>>>> Wikipedia language links. One type of report that was generated is
>>>>> the list of all articles in the main namespaces of the Wikipedias that
>>>>> link to more than one article in another language edition of Wikipedia
>>>>> (so-called double language links). There are not that many of them
>>>>> (about 19,000 in total), split by language, all available here:
>>>>>
>>>>> <http://simia.net/languagelinks/>
>>>>>
>>>>> Double language links are not errors per se, but they come with a few
>>>>> nuisances:
>>>>> * they lead to two links in the language links list that just look the
>>>>> same (you have to hover over them to see that they link to different
>>>>> articles), which is not really optimal from the user experience side
>>>>> * they are not saved in the langlinks table and thus are ignored in
>>>>> certain reports and also in the respective export
>>>>>
>>>>> I am not sure how to reach out to the respective Wikipedia
>>>>> communities, or if I should at all. Should I post to their respective
>>>>> version of the village pump? Remembering from the time I was active on
>>>>> the Croatian Wikipedia, I would have appreciated that list to check
>>>>> the entries. I reckoned the wikipedia-l list would be the right place,
>>>>> but that list looks rather dead.
>>>>>
>>>>> Cheers,
>>>>> Denny
>>>>>
>>>>
>>>>
>>>> _______________________________________________
>>>> Wikimedia-l mailing list
>>>> Wikimedia-l at lists.wikimedia.org
>>>> Unsubscribe: https://lists.wikimedia.org/mailman/listinfo/wikimedia-l
>>>
>>>
>>>
>>> --
>>>
>>> -----------------------------------------------------------
>>> Vereniging Wikimedia Nederland
>>> dr. Ziko van Dijk, voorzitter
>>> http://wmnederland.nl/
>>>
>>> Wikimedia Nederland
>>> Postbus 167
>>> 3500 AD Utrecht
>>> -----------------------------------------------------------
>>
>>
>>
>> --
>> Project director Wikidata
>> Wikimedia Deutschland e.V. | Obentrautstr. 72 | 10963 Berlin
>> Tel. +49-30-219 158 26-0 | http://wikimedia.de
>>
>> Wikimedia Deutschland - Gesellschaft zur Förderung Freien Wissens e.V.
>> Eingetragen im Vereinsregister des Amtsgerichts Berlin-Charlottenburg
>> unter der Nummer 23855 B. Als gemeinnützig anerkannt durch das
>> Finanzamt für Körperschaften I Berlin, Steuernummer 27/681/51985.



-- 
Project director Wikidata
Wikimedia Deutschland e.V. | Obentrautstr. 72 | 10963 Berlin
Tel. +49-30-219 158 26-0 | http://wikimedia.de

Wikimedia Deutschland - Gesellschaft zur Förderung Freien Wissens e.V.
Eingetragen im Vereinsregister des Amtsgerichts Berlin-Charlottenburg
unter der Nummer 23855 B. Als gemeinnützig anerkannt durch das
Finanzamt für Körperschaften I Berlin, Steuernummer 27/681/51985.
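
A rough sketch of the check discussed in the quoted thread: find the
languages that one article's wikitext links to more than once, skipping
commented-out links as Mark suggested. This is not the script that
produced the lists at simia.net. The function name, the small
LANGUAGE_CODES set, and the sample wikitext (including the [[de:...]]
links) are assumptions made up for illustration; only the two Danish
links come from the thread.

```python
import re
from collections import defaultdict

# Hypothetical subset of language codes; a real run would need the full list.
LANGUAGE_CODES = {"en", "da", "de", "hr", "nl"}

# HTML comments, so that commented-out links can be dropped before matching.
COMMENT_RE = re.compile(r"<!--.*?-->", re.DOTALL)
# [[xx:Title]] links; a leading colon ([[:xx:Title]]) is deliberately not matched,
# because that form is an inline link rather than a language link.
LINK_RE = re.compile(r"\[\[\s*([a-z][a-z-]*)\s*:\s*([^\]|]+?)\s*\]\]")


def double_language_links(wikitext):
    """Return {language: sorted targets} for languages linked more than once."""
    visible = COMMENT_RE.sub("", wikitext)
    targets = defaultdict(set)
    for lang, title in LINK_RE.findall(visible):
        if lang in LANGUAGE_CODES:
            targets[lang].add(title)
    return {lang: sorted(t) for lang, t in targets.items() if len(t) > 1}


# Illustrative wikitext; the de links are invented for the example.
sample = """Móði and Magni are the sons of Thor.
[[da:Magni]]
[[da:Modi]]
[[de:Magni und Modi]]
<!-- [[de:Magni]] -->
"""

print(double_language_links(sample))
```

The sketch compares titles as plain strings, so it would count
[[da:Magni]] and [[da:magni]] as two targets; a real analysis would
presumably normalize titles and resolve redirects first.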


