Thanks for the thoughts, these are super useful! And yes, I sent out that mail yesterday without enough context.

Thad, thanks, LCID does indeed look very interesting! That could provide a source of numbers to draw from. But looking at it in detail, it also seems to amount to "we took the English language names, and in that order assigned numbers, until we finished the first batch, and then added more batches chronologically". So at least there's precedent for doing that.

Lucas, I was also thinking about QIDs. But in addition to the point that David raised, there's also the point that Wikidata should be the 'pure representation' of what a language is, whereas the objects in Wikifunctions representing languages should be malleable to exactly what we need and want them to be, irrespective of their independent ontological status in the world (I hope that makes sense?). In Wikifunctions we decide what a language is, what its fallbacks are, etc., based on product needs. In Wikidata, on the other hand, we decide what is a language and what is not based on what relevant sources state. These hopefully overlap, and ideally are equivalent, but in reality I don't expect them to be, and I don't want to introduce a push for edits in Wikidata to make certain features in Wikifunctions work.

Yes, we should have mappings from the relevant ZIDs to QIDs, and/or the other way around, but that's why I think they shouldn't be the same.
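
To make that a bit more concrete, here is a rough sketch in Python of the kind of separation I mean. Everything in it is an assumption for illustration only: the `LanguageObject` class and its fields are hypothetical, the ZIDs are made-up placeholders, and the claim that "de-formal" has no Wikidata item of its own is assumed, not checked. Only the two QIDs (Q1860 for English, Q188 for German) are real Wikidata items.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class LanguageObject:
    """Hypothetical Wikifunctions-side language object (illustration only)."""
    zid: str                             # identifier owned by Wikifunctions (placeholder values)
    code: str                            # language code used by the product
    fallback_zid: Optional[str] = None   # fallback chosen for product reasons
    qid: Optional[str] = None            # optional link to a Wikidata item


LANGUAGES = [
    LanguageObject(zid="Z9001", code="en", qid="Q1860"),
    LanguageObject(zid="Z9002", code="de", fallback_zid="Z9001", qid="Q188"),
    # A product-driven variant, assumed here to have no Wikidata item of its own.
    LanguageObject(zid="Z9003", code="de-formal", fallback_zid="Z9002"),
]

# The mappings in both directions are derived data; the ZID stays the primary key.
zid_to_qid = {lang.zid: lang.qid for lang in LANGUAGES if lang.qid is not None}
qid_to_zid = {qid: zid for zid, qid in zid_to_qid.items()}
```

The point of the sketch is only that a language object can exist, and have a fallback, without any corresponding Wikidata item, while the mapping is still available wherever one does exist.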

Charles, regarding your point: yes, we should be compatible with and map to external standards as much as possible, but for the same reason as with Wikidata, we shouldn't simply import them wholesale.

I wrote down a few more thoughts on-wiki here:

https://meta.wikimedia.org/wiki/Abstract_Wikipedia/Representation_of_languages

On Thu, Mar 11, 2021 at 5:44 AM Charles Matthews via Abstract-Wikipedia <abstract-wikipedia@lists.wikimedia.org> wrote:

> On 11 March 2021 at 12:24 David Abián <davidabian@wikimedia.es> wrote:
>
>
> Hi,
>
> Although we usually claim that Wikidata entities and their URIs are
> stable, I believe that, unfortunately, their stability isn't as high as
> desirable in practice, or at least the risk of deletion, merging,
> redirection, redefinition of identity due to confusion or ambiguity,
> etc. exists and materializes too often (I hope I'm not being read by
> Wikidata haters :D). This happens with all kinds of Items, and it can
> happen with minority languages as well as with something as widespread
> as, for instance, Chinese.

> On 11/03/2021 09:50, Lucas Werkmeister wrote:
> > Wouldn’t it be better to refer to them using Wikidata item IDs?

Not an area where I'm an expert. But there seem to be quite a number of standard IDs for languages, including some which are ISO, and P424 on Wikidata which is "Wikimedia language code".

I would say the default solution would be Z-numbers, and a Wikidata property for WA ID so that cross-walking to any other standard ID is just a Wikidata query away. If what is intended is an exact fit with P424 then that would be duplication; but I suppose in the end that won't be the decision.

Charles