Re: [Wikidata] Label gaps on Wikidata

27 Feb 2017

      On 19.02.2017 at 17:00, Romaine Wiki wrote:
...
Hi all,
If you look in the recent changes, most items have labels in English and those
are shown in the recent changes and elsewhere (so we know what the item is about
without opening it first).
Wikidata actually tries to show you the labels in your preferred interface
language. And if your user language is not available, it uses a fallback
mechanism to show the next-best language, which may even include automated
transcriptions. When all else fails, it will show the English label. If that
doesn't exist, it shows the ID.
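
The lookup order described above can be sketched roughly as follows. This is
only an illustration, not Wikidata's actual code: the function name pick_label,
the FALLBACK_CHAINS table and its entries are hypothetical, and the automated
transcription step is left out.

```python
from typing import Dict, List

# Hypothetical fallback chains, for illustration only. The real chains
# used by Wikidata are not given in this thread.
FALLBACK_CHAINS: Dict[str, List[str]] = {
    "de-at": ["de"],
    "kk-latn": ["kk", "kk-cyrl"],
}


def pick_label(item_id: str, labels: Dict[str, str], user_lang: str) -> str:
    """Return the best available label for an item, or its ID if none fits."""
    # 1. the user's language, 2. its fallback languages, 3. English
    chain = [user_lang] + FALLBACK_CHAINS.get(user_lang, []) + ["en"]
    for lang in chain:
        if lang in labels:
            return labels[lang]
    # 4. nothing matched: show the bare item ID
    return item_id


if __name__ == "__main__":
    print(pick_label("Q1", {"de": "Universum"}, "de-at"))
    print(pick_label("Q1", {"zh": "宇宙"}, "fr"))
```

Note that in this sketch an item with only a Chinese label falls through to the
ID for a French or English user, which is exactly the gap being discussed.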
...
But not all items have labels, and these items without
English label are often items with only a label in Chinese, Arabic, Cyrillic
script, Hebrew, etc. This forms a significant gap.
The fallback mechanism works OK, but is not great for English-speaking users, who
see a lot of items that have no English label. For English, we just don't know
what to fall back to. Just anything? Or try European languages first? What
should the rule be? If we can decide on a good rule, it should actually be
pretty simple to add such a fallback for English.
...
Is there a way to easily make a transcription from one language to another?
We have such rules for some languages/variants, e.g. between the Cyrillic and
the Roman representations of Kazakh or Uzbek. But transliteration rules can be
complex, and covering every permutation of the 300 languages we support would
mean we'd need about 45000 rule sets...
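
As a quick check of that estimate, assuming exactly 300 languages and one rule
set per unordered pair of languages (both are assumptions; the variable names
are made up for this sketch):

```python
import math

NUM_LANGUAGES = 300  # round figure taken from the text above

# One rule set per unordered pair of languages (A<->B counted once).
unordered_pairs = math.comb(NUM_LANGUAGES, 2)

# One rule set per direction (A->B and B->A counted separately).
ordered_pairs = NUM_LANGUAGES * (NUM_LANGUAGES - 1)

if __name__ == "__main__":
    print(unordered_pairs)  # 44850
    print(ordered_pairs)    # 89700
```

So "about 45000" matches counting each pair once; if every direction needed its
own rules, the number would be roughly twice that.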
...
Or alternatively if there is a database that has such transcriptions?
Not yet. One of the goals of Wikidata is to be that database.
-- 
Daniel Kinzler
Principal Platform Engineer

Wikimedia Deutschland
Gesellschaft zur Förderung Freien Wissens e.V.
