Jane, Lydia and WikiDatans,
These are great and helpful developments, which seem to be quite far along now.
Jane and WikiDatans, can you point to similar examples that would distinguish what one can extract from WikiData Categories with Magnus' reasonator tool from what one can 'extract' from them with SemanticWiki?
Lydia, are there any emerging video tutorials (in the manner of some of Google's, for example) about WikiData Categories and/or Wikidata itself that would make it very easy to use?
Thanks, Scott
On Fri, Jul 4, 2014 at 4:40 AM, Jane Darnell jane023@gmail.com wrote:
Q17 is Japan, and if you are interested in people from Japan for example, you can do this: http://tools.wmflabs.org/reasonator/?q=7463305 (thanks to Magnus' reasonator tool that can extract category-like info from Wikidata based on properties and qualifiers)
On Fri, Jul 4, 2014 at 9:01 AM, Daniel Kinzler < daniel.kinzler@wikimedia.de> wrote:
On 04.07.2014 07:10, Rohan Badlani wrote:
I downloaded the Wikidata dump from http://dumps.wikimedia.org/wikidatawiki/latest/ and found a file, wikidatawiki-20140420-pages-articles-multistream-index, which consists of triplets like:
537:114:Q17
I couldn't find documentation for the multistream-index format at https://meta.wikimedia.org/wiki/Data_dumps. I can't make sense of it myself offhand. Perhaps ask on the wikitech-l list. I suppose the authority on the question would be Ariel Glenn; perhaps you can get hold of him on IRC.
Note that this format is used for all wikis, so it will not contain anything that is specific to Wikidata. It would be the same for Wikipedia.
If you figure it out, please add the info to https://meta.wikimedia.org/wiki/Data_dumps!
which I interpreted as follows: 537 - category of the topic (which I am unable to find; I want the details of this item)
It's not a category. Wikidata doesn't use MediaWiki's Category feature for data items at all. Wikipedia does, but there, pages generally have multiple categories, each identified by name, not by a numeric ID.
If you want to build a classification graph of the concepts in Wikidata (I'm intentionally avoiding the terms "ontology" and "taxonomy" here), you will have to go by the properties P31 (instance of) and P279 (subclass of) which are used in many (roughly half) of the data items.
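To illustrate going by P31 and P279, here is a minimal sketch in Python. It assumes each item is available as a parsed JSON object in the shape served by Special:EntityData (a "claims" map keyed by property ID, each claim carrying a "mainsnak" with a "datavalue"); that shape is an assumption here and may differ in the dumps. The function names and the sample items are hand-written for illustration and are not taken from a dump.

```python
from collections import defaultdict

CLASS_PROPERTIES = ("P31", "P279")  # instance of, subclass of


def class_edges(entity):
    """Yield (item_id, property_id, target_id) for each P31/P279 claim."""
    for prop in CLASS_PROPERTIES:
        for claim in entity.get("claims", {}).get(prop, []):
            snak = claim.get("mainsnak", {})
            if snak.get("snaktype") != "value":
                continue  # skip "novalue" / "somevalue" snaks
            value = snak["datavalue"]["value"]
            # Assumption: the target is given as "id" or as "numeric-id".
            target = value.get("id") or "Q%d" % value["numeric-id"]
            yield entity["id"], prop, target


def build_graph(entities):
    """Map each item ID to the set of (property, target) edges it declares."""
    graph = defaultdict(set)
    for entity in entities:
        for source, prop, target in class_edges(entity):
            graph[source].add((prop, target))
    return dict(graph)


def _item_claim(numeric_id):
    """Build one illustrative claim pointing at item Q<numeric_id>."""
    return {
        "mainsnak": {
            "snaktype": "value",
            "datavalue": {
                "type": "wikibase-entityid",
                "value": {"entity-type": "item", "numeric-id": numeric_id},
            },
        }
    }


# Hand-written sample items, for illustration only.
SAMPLE = [
    {"id": "Q17", "claims": {"P31": [_item_claim(6256)]}},
    {"id": "Q900001", "claims": {"P279": [_item_claim(900002)]}},
    {"id": "Q900003", "claims": {}},  # an item with neither property
]

graph = build_graph(SAMPLE)
for item_id in sorted(graph):
    print(item_id, sorted(graph[item_id]))
```

Items that carry neither property simply do not appear in the graph, which matches the caveat that only about half of the items use them.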
114 - page_id of the item Q17.
That seems to be correct.
Q17 - which is the item. (JSON: https://www.wikidata.org/wiki/Special:EntityData/Q17.json)
It's the page title, which, on wikidata.org, is the same as the item ID.
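For reading the index lines, a rough Python sketch that relies only on what is established above: the second field is the page_id and the third is the page title (the item ID on wikidata.org). The first field is kept as a plain integer and left uninterpreted, since its meaning is not settled in this thread. The names IndexEntry, parse_index_line and entity_data_url are made up for this example.

```python
from typing import NamedTuple


class IndexEntry(NamedTuple):
    first_field: int  # meaning not settled in this thread; kept as-is
    page_id: int
    title: str


def parse_index_line(line):
    """Split one line of the multistream index into its three fields."""
    # Split at most twice: titles outside the main namespace contain
    # colons themselves (for example "Property:P31").
    first, page_id, title = line.rstrip("\n").split(":", 2)
    return IndexEntry(int(first), int(page_id), title)


def entity_data_url(title):
    """Build the Special:EntityData JSON URL for an item ID such as Q17."""
    return "https://www.wikidata.org/wiki/Special:EntityData/%s.json" % title


entry = parse_index_line("537:114:Q17")
print(entry.page_id, entry.title, entity_data_url(entry.title))
```

The URL helper only makes sense for titles that are entity IDs; other pages in the index would need to be filtered out first.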
HTH Daniel
PS: we are close to providing JSON dumps on a regular basis, and to making the JSON contained in the XML dumps more readable. This will hopefully make analyzing Wikidata less painful.
-- Daniel Kinzler Senior Software Developer
Wikimedia Deutschland Gesellschaft zur Förderung Freien Wissens e.V.
Wikidata-l mailing list Wikidata-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikidata-l