On 13/08/13 22:26, Claus Stadler wrote:
Hi Markus,
Thank you for the information. Below are some thoughts and comments from our side.
This is because the Wikidata datatype for numbers is not implemented
yet.
OK, is there a timeline for when this is planned to be ready?
(Lydia answered this)
To edit Wikidata, you should create an account.
But one can use an existing Wikidata account to perform edits via the Wikidata API's "login" action, right?
Yes.
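For reference, a minimal sketch of that login flow, assuming the classic two-step action=login request of the MediaWiki API and Python's requests library (the account name and password are of course placeholders):

import requests

API_URL = "https://www.wikidata.org/w/api.php"
session = requests.Session()  # keeps the login cookies across requests

def login(username, password):
    # Step 1: the first call is answered with result=NeedToken and a token.
    r1 = session.post(API_URL, data={
        "action": "login", "lgname": username,
        "lgpassword": password, "format": "json"}).json()
    token = r1["login"]["token"]
    # Step 2: repeat the call with the token to complete the login.
    r2 = session.post(API_URL, data={
        "action": "login", "lgname": username, "lgpassword": password,
        "lgtoken": token, "format": "json"}).json()
    return r2["login"]["result"] == "Success"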
If you intend to do mass edits, ...
I don't think we are. The new DBpedia interface should just support users in transferring selected DBpedia data items to WD *via their own WD account*.
Ok, fair enough. One can also do anonymous edits, of course (but people should be warned then that their IP will be recorded publicly).
I could imagine that most of the data gets bulk imported from
Wikipedia infoboxes by the community anyway.
By "bulk imported from Wikipedia infoboxes by the community " are you referring to an automatic or manual process? For anything manual, the idea is to use the DBpedia-Viewer as a support tool, as it already contains the data from the infoboxes. If its automatic, can you explain on how the data is extracted from Wikipedia?
I guess "semi-automatic" best describes it. What usually happens is that some user proposes an import for some specific property (e.g., import sex information from Italian Wikipedia categories Man/Woman), and then this is done. I cannot explain in all cases how users get the information; it's quite amazing what they do ;-) However, there is no manual control of all imported facts, and errors have been known to happen. There are quality control mechanisms on Wikidata to find problems with the current data (whether imported or not).
Note that there are some differences beyond the vocabulary. Most
Wikidata statements have source information attached
The source for an item coming from DBpedia would be the Wikipedia of the corresponding language.
Yes, that seems to be a good idea.
, and there are also qualifiers (not used heavily yet, since the
selection/filtering mechanisms of Wikidata are quite weak so far, but this will change). In some domains, such as roles of actors in films, qualifiers are getting widely used now; so this is not really triple data any more. But there will be enough triple data left, I guess.
As for qualifiers, they don't really exist on DBpedia, so if someone wanted to provide them via the DBpedia-Viewer, one would have to provide them manually anyway. I am currently not sure whether they are a priority for us.
There will be enough cases where they are not needed. With respect to the viewer, the bigger challenge is probably to avoid duplicates/redundant information (entering a property without any qualifier is fine if there is nothing at all yet; but if there is already one with a more specific qualifier, then nothing else should be added).
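One way the viewer could implement that duplicate check, sketched under the assumption that action=wbgetclaims is queried before offering the transfer button (the IDs below are just the placeholders already used in the API example):

import requests

API_URL = "https://www.wikidata.org/w/api.php"

def has_existing_claim(item_id, property_id):
    # Ask Wikidata for the existing statements of this property on the item.
    r = requests.get(API_URL, params={
        "action": "wbgetclaims",
        "entity": item_id,        # e.g. "Q666615"
        "property": property_id,  # e.g. "P9001" (placeholder)
        "format": "json"}).json()
    return len(r.get("claims", {}).get(property_id, [])) > 0

# Only offer the transfer button if nothing is there yet:
# if not has_existing_claim("Q666615", "P9001"): ...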
So for the DBpedia-Viewer's "transfer DBpedia triple to Wikidata" feature we see three options, of which only (c) seems feasible:
(a) The viewer just provides a link to Wikidata, and the user has to fill out the forms there manually. But we would like a better interaction between the RDF data and WD.
(b) Wikidata offers a way to open an item with a pre-filled form. However, on WD it currently seems that only a single item can be in edit mode, so this won't work yet, and we are not sure whether this is ever planned to work.
(c) The DBpedia-Viewer aids the user by providing a pre-filled edit form, mapping a triple's property and object to the corresponding WD values. For validation, the user could be presented with the existing WD values for that property. Also, upon edit, a popover or tab with the corresponding WD page could open up.
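As an illustration of (c), one way to find candidate WD property IDs for a DBpedia property label would be the action=wbsearchentities search restricted to properties; a sketch, where the label "population" is only an example:

import requests

API_URL = "https://www.wikidata.org/w/api.php"

def suggest_properties(label, language="en"):
    # Search Wikidata properties whose label or alias matches the given text.
    r = requests.get(API_URL, params={
        "action": "wbsearchentities",
        "search": label,      # e.g. the local name of the DBpedia property
        "language": language,
        "type": "property",   # restrict results to properties (P...)
        "format": "json"}).json()
    # Return id, label and description so the edit form can show candidates.
    return [(hit["id"], hit.get("label", ""), hit.get("description", ""))
            for hit in r.get("search", [])]

# e.g. suggest_properties("population") -> candidates the user can confirm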
Yes, I agree that (c) is most convenient. Since all of the WD interface is coded in JavaScript, using the Web API for data exchange, one can create custom UIs with similar functionality quite well. In fact, there are user-contributed JavaScript modules that can be activated on wikidata.org to get alternative/additional UIs for editing. So this can be integrated into the web site quite easily if the code is there.
So this kind of edit seems to be possible to do with the API, yet we would need a mapping between the WD RDF and the WD IDs.
All URIs in WD RDF contain the relevant IDs already as substrings. Moreover, the WD RDF URIs are resolvable and support content negotiation (though most formats are quite limited, e.g., the RDF is not the complete RDF that you have in the dumps yet). Is there any further mapping you need?
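To make that concrete, a small sketch of extracting the ID from a WD RDF URI and fetching the entity data through Special:EntityData (here with the .nt suffix mentioned below; error handling omitted):

import requests

def wikidata_id_from_uri(uri):
    # The ID is the last path segment, e.g.
    # "http://www.wikidata.org/entity/Q666615" -> "Q666615"
    return uri.rstrip("/").rsplit("/", 1)[-1]

def fetch_entity_ntriples(entity_id):
    # Special:EntityData serves several formats; .nt requests N-Triples.
    url = "http://www.wikidata.org/wiki/Special:EntityData/%s.nt" % entity_id
    return requests.get(url).text

# fetch_entity_ntriples(wikidata_id_from_uri("http://www.wikidata.org/entity/Q666615"))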
Markus
Cheers, Claus
p.s:
Note that all property ids start with a P. If it's of the form Q...,
then it is not a property. Oops ;)
On 08/13/2013 09:32 PM, Markus Krötzsch wrote:
Hi Claus,
a brief partial reply:
On 13/08/13 16:20, Claus Stadler wrote: ...
For example, I notice that the Wikidata page for my home town "Berndorf in Lower Austria" does not contain the population: http://www.wikidata.org/wiki/Q666615
This is because the Wikidata datatype for numbers is not implemented yet.
Looking at the corresponding DBpedia entry, this information actually exists there: http://dbpedia.org/resource/Berndorf,_Lower_Austria
The new DBpedia interface should offer a button next to the "population 8728" triple which enables transfer of this information to Wikidata.
To edit Wikidata, you should create an account. If you intend to do mass edits, this account should be granted bot status first to prevent it from being blocked if it sends a lot of requests. This is mostly a community process: you should discuss the intended edit activities with the community to find out if they are happy with this (this list is only about the technical aspects). It is good to have additional input, but I could imagine that most of the data gets bulk imported from Wikipedia infoboxes by the community anyway, which is what happens with a lot of data right now.
In another GSoC project, Hady Elsahar is working on mappings between the wikidata RDF vocabulary and the DBpedia vocabulary. This means, we can in principle map DBpedia RDF data to Wikidata RDF.
Note that there are some differences beyond the vocabulary. Most Wikidata statements have source information attached, and there are also qualifiers (not used heavily yet, since the selection/filtering mechanisms of Wikidata are quite weak so far, but this will change). In some domains, such as roles of actors in films, qualifiers are getting widely used now; so this is not really triple data any more. But there will be enough triple data left, I guess.
However, looking at the Wikidata API [2], there is action=wbcreateclaim, with the example:
api.php?action=wbcreateclaim&entity=q42&property=p9001&snaktype=novalue&token=foobar&baserevid=7201010
So the core question is, how can we map e.g. properties such as wikidata:population (if that existed) to their respective Wikidata property identifier (Q12345)? This goes for any property that may occur in an RDF dump, such as: http://www.wikidata.org/wiki/Special:EntityData/Q666615.nt
Note that all property ids start with a P. If it's of the form Q..., then it is not a property.
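For completeness, a hedged sketch of what a wbcreateclaim call with an actual value could look like once the property ID (P...) and the target value are known. It assumes a logged-in requests.Session with an edit token already obtained, and that the property takes an item value; the IDs are placeholders taken from the example above:

import json

API_URL = "https://www.wikidata.org/w/api.php"

def create_item_claim(session, entity_id, property_id, target_numeric_id, token):
    # `session` is a logged-in requests.Session (see the login sketch earlier).
    # JSON-encoded datavalue for a property of datatype "item".
    value = json.dumps({"entity-type": "item", "numeric-id": target_numeric_id})
    return session.post(API_URL, data={
        "action": "wbcreateclaim",
        "entity": entity_id,      # e.g. "Q42"
        "property": property_id,  # e.g. "P9001" (placeholder)
        "snaktype": "value",      # unlike the novalue example above
        "value": value,
        "token": token,           # edit token of the logged-in session
        "format": "json"}).json()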
Cheers,
Markus