On 17/09/12 12:54, Jeroen De Dauw wrote:
Hey,
As explained in the text, the aliases are not distinguished from
other property values in the data model right now. This was the
status of the discussion when we last talked about this, but we can
also re-introduce aliases as a special field (I see why this would
be useful). Daniel had an argument against this, saying that many
other property values could also work as aliases in certain domains
(e.g. binomial names of biological species). So the special status
of the alias in the data model was questioned.
Right, that makes sense to implement at some point if there really is
demand for this. This is rather harder to implement than what we're
currently doing and is blocked by phase 2 stuff and probably phase 3
stuff, while we want to have it in phase 1 already.
A while back we also had a related discussion where Daniel took the
position that we should also not have special labels and descriptions.
The conclusion of that was that we will have them but that we will make
them accessible via the same interface as regular properties (at least
for read ops).
Ok, I agree with that. I will change the model to have explicit aliases
somewhere.
if two items have the same description, can one of them use an alias
that is the title of the other?
Good question. Right now this is not enforced. Then again, right now
aliases are not used anywhere for lookups except in the fulltext search
thing, where this restriction is not really relevant. Denny, Daniel, any
thoughts on this?
This is also based on a preliminary decision made a while back: the
idea was that properties, while not having Wikipedia articles, will
still need unique string identifiers that can be used in wikitext (e.g.
queries) where one does not want to address properties by ID or by
"label+description" pairs.
This seems odd to me - you sure the term TitleRecord is being used
consistently through the data model and this thread? I'm using it as
"GlobalSiteId PageName".
Yes, this is what I mean. But PageName is just a string, and does not
need to refer to an actual page (or be displayed as a link). It can
still be used as a "string key" to refer to the property on a certain site.
I do agree you would probably not want to put label and description in
wikitext, and that just the label might or might not be sufficient, even
if they are unique per language. If you need an id that really is always
unique you can just use the p12345 thing. Since most of the editing of
these will happen via GUIs (right?) this seems to be quite acceptable.
Or does anybody see a better approach?
Well, the above. It allows you to assign a human-readable key to each
property that you can use instead of p12345 and that is still unique for
each site. Moreover, this can be done with code that is similar to what
we already have for site links in Items (but without linking and thus
also without auto completion).
In any case, why would you resort to "GlobalSiteId PageName" rather
than "label description"?
Because it is easier. First of all, "label description" is not enough:
you need to say which language you talk about to make it a key (this can
be guessed from the site, but this is still not a unique selection).
Second, you do not need to mention the GlobalSiteId if you are on a site
and want to use its own ID. So one addressing method requires one string
key (PageName), the other requires three string keys (language, label,
description). The former seems easier.
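
To make the comparison concrete, here is a minimal Python sketch of the
two addressing methods. The dictionaries, function names and example
values are made up for illustration; they are not part of the data model
or of any existing code.

```python
from typing import Dict, Optional, Tuple

# Hypothetical in-memory indexes; all names and values are illustrative only.
# Method 1: one string key per site, (global_site_id, page_name) -> property ID.
by_site_key: Dict[Tuple[str, str], str] = {
    ("enwiki", "population"): "p12345",
    ("dewiki", "Einwohnerzahl"): "p12345",
}

# Method 2: three string keys, (language, label, description) -> property ID.
by_label: Dict[Tuple[str, str, str], str] = {
    ("en", "population", "number of inhabitants of a place"): "p12345",
}


def resolve_on_site(
    local_site: str, page_name: str, site: Optional[str] = None
) -> Optional[str]:
    """Resolve a property by PageName; the site defaults to the local one."""
    return by_site_key.get((site or local_site, page_name))


def resolve_by_label(language: str, label: str, description: str) -> Optional[str]:
    """Resolve a property by the full (language, label, description) triple."""
    return by_label.get((language, label, description))
```

On the local site the first method needs only the PageName; the
GlobalSiteId is added only when referring to another site's key.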
What makes it so odd is that the "GlobalSiteId PageName" is meant to
indicate equivalence of items across sites, which is rather different
from using it to identify properties in wikitext.
What you are saying ("equivalence across sites") is only another way to
say that "GlobalSiteId PageName" is a key for entities on Wikidata. Such
keys can always be used to define equivalence classes (of keys that
refer to the same thing); how is that a problem?
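
As a rough illustration of that point, the sketch below groups
"GlobalSiteId PageName" keys by the entity they refer to. The function
name, the entity IDs and the example pages are assumptions made up for
this example.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

Key = Tuple[str, str]  # (global_site_id, page_name)


def equivalence_classes(key_to_entity: Dict[Key, str]) -> Dict[str, List[Key]]:
    """Group keys by the entity they refer to."""
    classes: Dict[str, List[Key]] = defaultdict(list)
    for key, entity_id in key_to_entity.items():
        classes[entity_id].append(key)
    # Sort each class so the result does not depend on insertion order.
    return {entity_id: sorted(keys) for entity_id, keys in classes.items()}


# Made-up example data: two keys for one item, one key for a property.
keys = {
    ("enwiki", "Berlin"): "q1",
    ("frwiki", "Berlin"): "q1",
    ("enwiki", "population"): "p12345",
}
classes = equivalence_classes(keys)
```

Whether the key belongs to an item or to a property makes no difference
to the grouping; the equivalence classes fall out of the key mapping.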
It seems that a property could at best have a list of
PropertyValueSnaks (no auxiliary Snaks, no references, no statement rank).
Why not have a list of claims?
Do you think you need auxiliary Snaks there? The step from "list of
snaks" to "list of snaks with auxiliary snaks (i.e., claims)" is not
hard to make (even later), but I would not make it without a cause. In
general, what is the motivation of allowing arbitrary Snaks for
properties? Really general annotations (users define properties, e.g.,
to organise other properties) or merely technical information (some
properties need extra information about things like units of measurement)?
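
A small sketch of why the step from "list of snaks" to "list of claims"
is cheap to make later. The field names and the helper function are
hypothetical; only the terms PropertyValueSnak, auxiliary Snak and claim
come from the discussion.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class PropertyValueSnak:
    # Hypothetical fields: which property is used and the value it takes.
    property_id: str
    value: str


@dataclass
class Claim:
    # A claim is a main snak plus an optional list of auxiliary snaks.
    main_snak: PropertyValueSnak
    auxiliary_snaks: List[PropertyValueSnak] = field(default_factory=list)


def snaks_to_claims(snaks: List[PropertyValueSnak]) -> List[Claim]:
    """Upgrade a plain list of snaks to claims without auxiliary snaks."""
    return [Claim(main_snak=snak) for snak in snaks]


# Made-up example: a property annotated with a unit of measurement.
snaks = [PropertyValueSnak(property_id="p1", value="metre")]
claims = snaks_to_claims(snaks)
```

Under these assumptions, existing data migrates by wrapping each snak in
a claim with an empty auxiliary list.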
Markus