Hello Michael,
Thank you for your input; it is extremely valuable.
In general, I expect that Wikidata will serve your needs better than an extraction from Wikipedia could. First, yes, we will have more stable identifiers. Second, it should be better at identifying items of interest. One of the reasons why several meanings are conflated into one article, or spread over several articles, in Wikipedia is that it simply makes sense for a text encyclopedia. I don't see a reason for Wikidata to do the same.
I do not expect Wikidata to solve all problems. In some glorious future, Wikidata will have a community. This community will decide on criteria for inclusion, both with regard to the coverage of items and with regard to what is said about them. The community will decide on the kinds of sources it accepts. Etc.
(Actually, "decide" is too nice a word for the process I expect will unfold... )
We will keep the problems you mentioned in mind, and I fully expect that we will improve on every single one of them.
2012/7/3 Michael Smethurst <michael.smethurst@bbc.co.uk>:
So I think we'd be interested in wikidata for 2 (maybe 3) reasons:
1. as a source of data for domains where there's no established (open) authority (eg the equivalent of musicbrainz for films)
2. as a better, more stable source of identifiers to triangulate to other data sources
Yes, I expect that both use cases will be covered by Wikidata.
?3?. Possibly as a place to contribute some of our data (eg we're donating our classical music data to musicbrainz; there may be data we have that would be useful to wikidata)
It will be up to the community to accept data donations -- the development team does not speak for the community. Personally I would be thrilled to see such donations happen. See also:
Have glanced quickly at the proposed wikidata uri scheme (http://meta.wikimedia.org/wiki/Wikidata/Notes/URI_scheme#Proposal_for_Wikidata) and
<snip>
http://{site}.wikidata.org/item/{Title} is a semi-persistent convenience URI for the item about the article Title on the selected site. Semi-persistent refers to the fact that Wikipedia titles can change over time, although this happens rarely.
</snip>
Not sure on the definition of infrequently but I know it's caused us problems.
Fully agree. But they make for nice-looking URIs. The canonical URI, though, is the ID-based one, and the ID-based URIs are stable. The pretty ones are for convenience only. I will take a look at the note to see if this needs to be made more explicit.
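To make the distinction concrete, here is a minimal sketch of the two URI forms, built from the patterns quoted in this thread (http://wikidata.org/id/Q{id} and http://{site}.wikidata.org/item/{Title}). The function names are hypothetical, the scheme is still a proposal, and the handling of spaces in titles (underscores, then percent-encoding) is an assumption borrowed from how Wikipedia titles usually appear in URLs.

```python
from urllib.parse import quote


def canonical_uri(item_id: int) -> str:
    """Stable, ID-based URI following the proposed pattern http://wikidata.org/id/Q{id}."""
    if item_id <= 0:
        raise ValueError("item_id must be a positive integer")
    return f"http://wikidata.org/id/Q{item_id}"


def convenience_uri(site: str, title: str) -> str:
    """Semi-persistent, title-based URI following http://{site}.wikidata.org/item/{Title}.

    Assumption: spaces become underscores and the rest is percent-encoded.
    """
    encoded_title = quote(title.replace(" ", "_"))
    return f"http://{site}.wikidata.org/item/{encoded_title}"


# Illustrative values only; the ID and title are made up.
print(canonical_uri(123))
print(convenience_uri("en", "Some Title"))
```

If an article is renamed, only the result of the second function changes; the ID-based URI stays the same, which is why it is the one to store as a key.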
Wondering if the id in http://wikidata.org/id/Q{id} is the wikipedia row ID (as used by dbpedialite)? Also wondering why there's a different set of URIs for machine-readable access rather than just using content negotiation?
No, it is not. There is no such thing as the "wikipedia row ID"; what you mean is the "page ID on the English Wikipedia". As there are plenty of items that have articles only in Wikipedias other than the English one, a reliance on the English page ID would be problematic. We introduce new IDs for Wikidata, but we will provide mappings to page IDs in the different Wikipedia language editions.
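As a rough illustration of why a Wikidata-specific ID plus per-edition mappings is more robust than the English page ID, consider the sketch below. The data structure, the function name, and all IDs are made up for the example; how the mappings will actually be exposed is not decided here.

```python
from typing import Dict, Optional

# Hypothetical mapping: Wikidata item ID -> {language edition: local page ID}.
# All numbers are invented for illustration.
PAGE_IDS: Dict[str, Dict[str, int]] = {
    "Q1": {"en": 1001, "de": 2002},
    "Q2": {"de": 2003},  # this item has no article on the English Wikipedia
}


def page_id(item: str, site: str) -> Optional[int]:
    """Return the local page ID of an item on one language edition, or None if absent."""
    return PAGE_IDS.get(item, {}).get(site)


# "Q2" is still addressable by its own ID even though it has no English page ID.
print(page_id("Q2", "en"))
print(page_id("Q2", "de"))
```

An identifier scheme keyed on the English page ID would have no way to name the second item at all.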
Thank you again for your input, and I hope the answers help.
Cheers, Denny