On 24.10.2014 02:17, James Heald wrote:
I think the issue I'm stuck on is: what property would the qualifier be
attached to?
...
The first choice might be attaching the information to
a "Creator" property.
I would prefer "Contributor", but yeah, something like that.
But for the underlying works of these engravings, there are typically *two*
creators, both of whom are significant -- the artist, and the engraver.
You can have any number of Statements about a Property, and each of these
Statements has its own set of Qualifiers (and Source References). E.g.
Contributor: Henry Foo
  Point in time: 1872
  Role: Engraver
Contributor: Melissa Bar
  Point in time: 1870
  Role: Illustrator
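To make the shape of that concrete, here is a rough sketch in Python. It is
only an illustration of the idea, not the actual Wikibase data model or API:
the class name `Statement`, its fields, and the helper `contributors_by_role`
are made up for this example.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Statement:
    """One statement about a property, carrying its own qualifiers.

    Hypothetical structure for illustration only.
    """
    prop: str
    value: str
    qualifiers: Dict[str, str] = field(default_factory=dict)


# Two top-level Contributor statements, each with separate qualifiers.
statements: List[Statement] = [
    Statement("Contributor", "Henry Foo",
              {"Point in time": "1872", "Role": "Engraver"}),
    Statement("Contributor", "Melissa Bar",
              {"Point in time": "1870", "Role": "Illustrator"}),
]


def contributors_by_role(stmts: List[Statement], role: str) -> List[str]:
    """Return the values of Contributor statements qualified with `role`."""
    return [
        s.value
        for s in stmts
        if s.prop == "Contributor" and s.qualifiers.get("Role") == role
    ]
```

Because the role sits on the statement it belongs to, asking for the engraver
never needs a qualifier on a qualifier.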
So instead, we might consider an "Underlying work" property, analogous to the
"Work" class in the Multimedia API development, "a creation to which copyright,
authorship, etc is attached", as per
https://docs.google.com/document/d/1tzwGtXRyK3o2ZEfc85RJ978znRdrf9EkqdJ0zVj…
But can we then capture the whole of the work class in such a property?
No. Using "Underlying work" (or, as I would prefer to call it, "Derivative
of"), the Work has to be modeled as an Entity in its own right - either a
Wikidata Item or a MediaInfo entity.
There seem to me a couple of issues:
(1) What should be the value of the property? There doesn't seem to be an
obvious choice (e.g. if one were importing from a repository or catalogue).
What would be the datatype, and what should we store for this field?
It would be a reference to another Entity. Only the ID would be stored.
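As a minimal sketch of what "only the ID is stored" means, in Python for
illustration: the entity IDs ("Q1001", "M2001"), the labels, the dictionary
layout and the function `resolve_underlying_work` are all invented here and do
not correspond to a real API.

```python
# Hypothetical in-memory entity store; every ID and label is made up.
ENTITIES = {
    "Q1001": {
        "label": "Some Illustrated Book",
        "statements": {"Contributor": ["Melissa Bar"]},
    },
    "M2001": {
        "label": "Scan of plate 12",
        # The value of "Derivative of" is just the ID of another entity.
        "statements": {"Derivative of": ["Q1001"]},
    },
}


def resolve_underlying_work(entity_id):
    """Follow "Derivative of" references and return the target entities."""
    entity = ENTITIES[entity_id]
    target_ids = entity["statements"].get("Derivative of", [])
    return [ENTITIES[target_id] for target_id in target_ids]
```

Everything else about the underlying work lives on the referenced entity and
is looked up when needed, rather than being copied into the file's record.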
(2) It seems to me that we would need to enable qualifiers on qualifiers -- for
example, if we represented the creator of an underlying engraving using a
qualifier, we would then seem to need another qualifier to indicate whether the
role was as artist, or as engraver.
See above: there is no need for this, since we can have any number of "top
level" Creator/Contributor entries.
In some cases, the contributor's role may be implicit by using a more specific
Property, like Painter, Director, etc.
Similarly, if there is sourcing, there are sources that might apply to one (1st
level) qualifier, but not another. But normally the WD sourcing model is for a
whole statement, not part of it.
They would apply to one *Statement* but not the other:
Contributor: Henry Foo
  ...
  Reference:
    Title: Detailed Research On That Book
    DOI: ...
  Reference:
    Title: My Art Website
    URL: ...
Contributor: Melissa Bar
  ...
  Reference:
    Title: Awesome Art Book
    Author: R.N. Dewy
    ISBN: ...
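In code terms, the point is that each statement carries its own list of
references. The sketch below uses plain Python dictionaries; the key names
("references", "title") and the helper `sources_for` are assumptions made for
this illustration only.

```python
# Hypothetical layout: one dict per statement, each with its own references.
statements = [
    {
        "property": "Contributor",
        "value": "Henry Foo",
        "references": [
            {"title": "Detailed Research On That Book"},
            {"title": "My Art Website"},
        ],
    },
    {
        "property": "Contributor",
        "value": "Melissa Bar",
        "references": [
            {"title": "Awesome Art Book", "author": "R.N. Dewy"},
        ],
    },
]


def sources_for(stmts, contributor):
    """Return the reference titles attached to one contributor's statement."""
    return [
        ref["title"]
        for stmt in stmts
        if stmt["value"] == contributor
        for ref in stmt["references"]
    ]
```

A source for Henry Foo's contribution says nothing about Melissa Bar's, and
the structure keeps them apart without any per-qualifier sourcing.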
What we would really be doing, if we did this in full, would be in effect to
store the contents of what might otherwise be an entire item in a property.
If we have that much relevant information, it might be worth creating a data
item. Especially if we end up repeating that info for multiple files (e.g.
engravings from the same book).
This can and should be decided on a case by case basis. Just like on Wikipedia,
it makes sense to create a separate Article when a section of some more general
article grows too big.
That has some attractiveness, if at a future time one wanted to promote the
'underlying work' to have a Wikidata item in its own right -- the two structures
would then match exactly.
But it would mean CommonsData having a slightly different data structure to
Wikidata.
Slightly different isn't a problem, but the ability to "nest" entities and/or
qualifiers is a fundamental structural incompatibility. That's not good.
...
If we're looking to support these searches and orderings, does it matter that a
particular field may sometimes be on the file item, but sometimes on a Wikidata
item?
Searching (or rather: querying) across both datasets at once would be nice, but
that's pretty far off. First, we need decent query capabilities for the
individual datasets.
I would imagine that for all files based on a specific book, the same approach
would be chosen (e.g. a Wikidata Item for the Book, and MediaInfo for each file).
Note that Queries are different from Searches. Searches are ranked and
potentially open-ended. Queries have a definite result set, and may be sorted.
Queries will (in the future) be pre-defined and cached, and can be used on wiki
pages via Lua, to create a list or table based on whatever logic you like. On
that level, it would also be possible to combine information from two
repositories (Wikidata and Commons), but at that point, we are talking about
proper programming in Lua.
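To give an idea of the kind of logic such code would contain, here is a small
sketch. On a wiki page this would be written in Lua; Python is used here only
for illustration. The two dictionaries stand in for the two repositories, and
all IDs, field names and function names are made up.

```python
# Hypothetical stand-ins for the two repositories; nothing here is a real API.
WIKIDATA = {
    "Q1001": {"Contributor": "Henry Foo", "Point in time": 1872},
}
COMMONS = {
    # Two copies of the same engraving point at a shared Wikidata item.
    "M1": {"Derivative of": "Q1001"},
    "M2": {"Derivative of": "Q1001"},
    # A single copy keeps its data directly on the file's own record.
    "M3": {"Contributor": "Melissa Bar", "Point in time": 1870},
}


def lookup(file_id, prop):
    """Read `prop` from the file record, else from the referenced item."""
    info = COMMONS[file_id]
    if prop in info:
        return info[prop]
    work_id = info.get("Derivative of")
    if work_id is not None:
        return WIKIDATA.get(work_id, {}).get(prop)
    return None


def files_sorted_by_date():
    """A query in the sense above: a definite result set, sorted by date."""
    rows = [(file_id, lookup(file_id, "Point in time")) for file_id in COMMONS]
    rows = [row for row in rows if row[1] is not None]
    return sorted(rows, key=lambda row: (row[1], row[0]))
```

The fallback in `lookup` is what hides from the caller whether a given field
happens to live on the file or on the item it derives from.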
Would it matter that for one of the engravings we have two copies, so the
information that we would be wanting for search and selection and ordering would
be stored on a Wikidata item; whereas for the rest, with only a single copy, it
would be stored on a Commons item?)
It would be tricky to manage this nicely for the general case. For your specific
book, you may write some specialized Lua code that deals with this.
However, I would not recommend creating a data item just because you have two
files in a single case. If the relevant data is not too extensive, it's fine to
duplicate it.
None of these questions are without solutions. But it does, I think, require a
decisive view to be reached, as to what we propose to do.
I think there are two main parts to your questions:
a) How to model contributions without modeling all the "base works" separately.
I think multiple Contributor statements with separate lists of qualifiers and
source references cover this.
b) How to best integrate the information that lies partially on Wikidata, and
partially on Commons. This is indeed tricky, and perhaps there is no general,
one-size-fits-all solution.
One thing that may help is the planned "high level media info API", which
provides license/attribution/legal information about files in a unified form,
drawing from structured data both on Commons and Wikidata.
--
Daniel Kinzler
Senior Software Developer
Wikimedia Deutschland
Gesellschaft zur Förderung Freien Wissens e.V.