The current spec of the data model states that an L-Item has a lemma, a language, and several forms, and the forms in turn have representations.

https://www.mediawiki.org/wiki/Extension:WikibaseLexeme/Data_Model

The language is a Q-Item; the lemma and the representations are Multilingual Texts. Multilingual Texts are sets of pairs of strings and UserLanguageCodes.

My question is about the relation between representing a language as a Q-Item and as a UserLanguageCode.

A previous proposal treated lemmas and representations as raw strings, with the language pointing to the Q-Item being the only language information. This is now gone, and the lemma and representations carry their own language information.

How do they interact? The set of languages referenceable through Q-Items is much larger than the set of languages with a UserLanguageCode, and indeed, the intention was to allow every language to be representable in Wikidata, not only those with a UserLanguageCode.

I sense quite a problem here.

I see two possible ways to resolve this:

- return to the original model and use strings instead of Multilingual Texts (with all the negative implications for variants)
- use Q-Items instead of UserLanguageCodes for Multilingual Texts (which would be quite a migration)

I don't think restricting Wiktionary4Wikidata support to the list of languages with a UserLanguageCode is a viable solution, which is what would happen if we implement the data model as currently suggested, if I understand it correctly.

Cheers,
Denny
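
To make the mismatch above concrete, here is a minimal Python sketch of the model as described in this mail. It is only an illustration under stated assumptions, not the actual Wikibase implementation: the class names, the `lemma_code_for` helper, the three-entry code set, and the Q-Item IDs are all hypothetical placeholders.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional

# Assumption: a tiny stand-in for the real list of UserLanguageCodes.
USER_LANGUAGE_CODES = {"en", "de", "fr"}

# Assumption: placeholder Q-Item IDs mapped to codes; not real Wikidata IDs.
QITEM_TO_CODE: Dict[str, str] = {"Q-english": "en", "Q-german": "de"}

# A Multilingual Text: UserLanguageCode -> string.
MultilingualText = Dict[str, str]


@dataclass
class Form:
    representations: MultilingualText


@dataclass
class Lexeme:
    language: str                 # Q-Item ID of the language
    lemma: MultilingualText       # keyed by UserLanguageCode, not by Q-Item
    forms: List[Form] = field(default_factory=list)


def lemma_code_for(language_qitem: str) -> Optional[str]:
    """Return a UserLanguageCode usable for this language, or None if there is none."""
    code = QITEM_TO_CODE.get(language_qitem)
    return code if code in USER_LANGUAGE_CODES else None


# A language with a code can carry a lemma keyed by that code.
code = lemma_code_for("Q-english")
english = Lexeme(language="Q-english", lemma={code: "house"})

# A language that exists only as a Q-Item has no key to store its lemma under.
missing = lemma_code_for("Q-rare-language")
```

In this sketch `missing` is `None`: the language can be named through its Q-Item, but there is no UserLanguageCode under which its lemma or representations could be stored, which is the gap the two options above try to close.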
_______________________________________________
Wikidata mailing list
Wikidata@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata