[Wikidata-l] What is the point of labels? - Wikidata

5 Jun 2014

When I drafted the functional structure that is appearing on items [1],
Gerard pointed out that it is drifting into the lexical area. That made me
think that, while it would be useful to have lexical data as independent
items, as we discussed last year for Wiktionary, the current structure
"q item <label> string" doesn't seem to be compatible with that wish, or at
least it would make it harder to maintain the same label in two places. And
there is not just one label per item: there are many, and each one might
have different lexical properties.

To be more efficient, it seems that we would need statements like "q item
<label> lexical item" to reflect that separation, but that adds further
complexity, because according to the latest Wikidata:Wiktionary proposal
[2], the "lexical item" (W) also contains senses/meanings (S). This is
somewhat redundant, as we already have Q items as the basis for meaning...
or at least for a concept that is more or less shared among languages. The
only difference between "Q items" and the proposed "S items" is that S items
represent only one of the lexeme meanings for one particular language, but
other than that they have the same nature as Q items (it should be possible
to add "subclass of" and other statements to them).
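To make the two shapes concrete, here is a rough sketch in Python. It is
only an illustration of the idea, not the Wikibase data model: the class
names, field names, and identifiers (Q_EXAMPLE, W_EXAMPLE, S_EXAMPLE_1...)
are all made up for this example.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Sense:
    """Hypothetical "S item": one meaning of one lexeme in one language."""
    sense_id: str
    gloss: str
    # Like a Q item, a sense could carry its own statements.
    statements: Dict[str, str] = field(default_factory=dict)


@dataclass
class Lexeme:
    """Hypothetical "W item": a lexical item that contains its senses."""
    lexeme_id: str
    language: str
    lemma: str
    senses: List[Sense] = field(default_factory=list)


@dataclass
class ItemWithStringLabels:
    """Current shape: "q item <label> string"."""
    item_id: str
    labels: Dict[str, str]  # language code -> plain string


@dataclass
class ItemWithLexemeLabels:
    """Discussed shape: "q item <label> lexical item"."""
    item_id: str
    labels: Dict[str, Lexeme]  # language code -> lexical item


# Illustrative data only; none of these identifiers exist anywhere.
bank = Lexeme(
    lexeme_id="W_EXAMPLE",
    language="en",
    lemma="bank",
    senses=[
        Sense("S_EXAMPLE_1", "financial institution",
              {"subclass of": "Q_ORGANIZATION_PLACEHOLDER"}),
        Sense("S_EXAMPLE_2", "land alongside a river"),
    ],
)

current = ItemWithStringLabels("Q_EXAMPLE", {"en": "bank"})
linked = ItemWithLexemeLabels("Q_EXAMPLE", {"en": bank})
```

In the second shape the label text lives only on the lexical item, so it
does not have to be maintained twice, but the senses hanging off that
lexical item overlap with what the Q item already expresses.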

Labels, aliases, and name properties are just normal statements where one
of them is preferred, so I have been wondering why we don't treat them as
such... That way we could have some coherence, with both "Q items" and
"S items" as the units of meaning/sense, and later on move the labels
(lexemes), which are now strings, to the lexical items ("W items" in the
example on the page Wikidata:Wiktionary).
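As a rough sketch of what "labels as normal statements where one of them is
preferred" could look like, assuming a simplified statement record of my
own invention (the property names and the display_name helper are
hypothetical; only the idea of ranking one statement above the others is
borrowed from how statements already work):

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class NameStatement:
    """Hypothetical statement that names an item in one language."""
    prop: str        # e.g. "label", "alias", "official name" (illustrative)
    language: str
    value: str
    rank: str = "normal"  # "preferred" marks the one shown as the label


def display_name(statements: List[NameStatement],
                 language: str) -> Optional[str]:
    """Return the preferred name for a language, else the first one found."""
    candidates = [s for s in statements if s.language == language]
    for statement in candidates:
        if statement.rank == "preferred":
            return statement.value
    return candidates[0].value if candidates else None


# Illustrative data only.
names = [
    NameStatement("alias", "en", "NYC"),
    NameStatement("label", "en", "New York City", rank="preferred"),
    NameStatement("label", "es", "Nueva York"),
]

english = display_name(names, "en")
spanish = display_name(names, "es")
french = display_name(names, "fr")
```

With this shape, the value of each statement could later be swapped from a
string to a link to a "W item" without changing how the preferred name is
picked.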

Summing up, labels in their current form make complete sense now, but when
considered together with lexical information, it seems that it would be
convenient to treat all of them as statements that could later link to
"W items". And as Joe pointed out, there are many more properties that are
equivalent to a label, just more specific, and that currently show up
neither in the suggester nor at the top of the page, where they should.

I know that Wiktionary is still in the future and that there are many other
priorities on the way. However, since the representation of the items is
being reconsidered, I think it is a good moment to think about how to move
little by little in the right direction. I would also like to point out
that by keeping lexical information in Wikidata, its complexity is
inevitably going to increase. If new users are already struggling to
understand it now, I cannot imagine how they will cope with added
elements...

Micru

[1] http://lists.wikimedia.org/pipermail/wikidata-l/2014-June/003941.html
[2] https://www.wikidata.org/wiki/Wikidata:Wiktionary