[Wikidata] Languages in Wikidata4Wiktionary

6 Apr 2017


The current spec of the data model states that an L-Item has a lemma, a
language, and several forms, and the forms in turn have representations.
https://www.mediawiki.org/wiki/Extension:WikibaseLexeme/Data_Model
The language is a Q-Item, the lemma and the representations are
Multilingual Texts. Multilingual texts are sets of pairs of strings and
UserLanguageCodes.
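
To make the structure concrete, here is a minimal sketch of the model as described above. The class and field names are hypothetical and chosen for illustration only; they are not taken from the WikibaseLexeme code, and the example codes and strings are invented.

```python
from dataclasses import dataclass, field
from typing import Dict, List

# Hypothetical names for illustration; not the WikibaseLexeme implementation.
# A Multilingual Text is modelled here as a mapping from a UserLanguageCode
# to a string, i.e. a set of (code, string) pairs with unique codes.
MultilingualText = Dict[str, str]


@dataclass
class Form:
    # Each form carries its own representations, keyed by UserLanguageCode.
    representations: MultilingualText


@dataclass
class LItem:
    language: str            # ID of the Q-Item for the language
    lemma: MultilingualText  # keyed by UserLanguageCode, not by Q-Item
    forms: List[Form] = field(default_factory=list)


# Illustrative data only: Q1860 is assumed to be the Q-Item for English.
example = LItem(
    language="Q1860",
    lemma={"en": "color", "en-gb": "colour"},
    forms=[Form({"en": "colors", "en-gb": "colours"})],
)
```

Note that the language appears twice in this sketch: once as a Q-Item on the L-Item, and again as a code on every string.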
My question is about the relation between representing a language as a
Q-Item and as a UserLanguageCode.
A previous proposal treated lemmas and representations as raw strings, with
the language pointing to the Q-Item being the only language information.
This is now gone, and the lemma and representation carry their own language
information.
How do they interact? The language set referenceable through Q-Items is much
larger than the set of languages with a UserLanguageCode, and indeed, the
intention was to allow for every language to be representable in Wikidata,
not only those with a UserLanguageCode.
I sense quite a problem here.
I see two possible ways to resolve this:
- return to the original model and use strings instead of Multilingual
texts (with all the negative implications for variants)
- use Q-Items instead of UserLanguageCodes for Multilingual texts (which
would be quite a migration)
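
The difference between the current model and the two options can be sketched as follows. Everything here is an assumption made for illustration: the function names, the tiny code list, and the placeholder language IDs are invented and do not come from the spec.

```python
from typing import Dict, Optional

# Hypothetical stand-ins for the real lists (invented for this sketch).
USER_LANGUAGE_CODES = {"en", "en-gb", "de"}

# Placeholder language item IDs mapped to a UserLanguageCode where one exists.
CODE_FOR_LANGUAGE: Dict[str, Optional[str]] = {
    "Q_ENGLISH": "en",
    "Q_RARE_LANGUAGE": None,  # a language item without a UserLanguageCode
}


def lemma_current_model(language_qid: str, text: str) -> Dict[str, str]:
    """Current spec: the lemma must be keyed by a UserLanguageCode."""
    code = CODE_FOR_LANGUAGE.get(language_qid)
    if code is None or code not in USER_LANGUAGE_CODES:
        raise ValueError(f"no UserLanguageCode for {language_qid}")
    return {code: text}


def lemma_option_1(language_qid: str, text: str) -> str:
    """Option 1: a raw string; the L-Item's Q-Item is the only language info."""
    return text


def lemma_option_2(language_qid: str, text: str) -> Dict[str, str]:
    """Option 2: Multilingual Text keyed by Q-Item instead of by code."""
    return {language_qid: text}
```

Under these assumptions, only the current model fails for a language that has a Q-Item but no code; option 1 loses the per-variant keys, and option 2 keeps them at the cost of changing the key type.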
I don't think restricting Wikidata4Wiktionary support to the list of
languages with a UserLanguageCode is a viable solution, yet that is what
would happen if we implement the data model as currently suggested, if I
understand it correctly.
Cheers,
Denny
