Hi mediawiki developers,
We (Google) are trying to keep our internal mirror of wikidata.org up to date in real time, so that our indexing pipeline can get the latest interlanguage information in real time.
I noticed wikidata.org is also powered by the standard MediaWiki software, where the standard query API for a specific revision works, e.g. revision query: http://www.wikidata.org/w/api.php?action=query&prop=revisions&format... and recentchanges: http://wikidata.org/w/api.php?action=query&list=recentchanges&format...
My questions are:
- Are the APIs above ("action=query&prop=revisions" and "action=query&list=recentchanges") the supported way to retrieve wikidata.org in real time?
- Is there any documentation of the JSON format in the response? It looks to me like "links" are the interwiki/interlanguage links (or sitelinks in wikidata.org's terminology), but I would feel more comfortable if I saw some official document about this.
- There are some ids like "dewiki", "enwiki" etc., which I guess correspond to the languages "de", "en" respectively. But is there a reliable map from these *wiki ids to the language codes? Some even use a 3-letter prefix, e.g. gotwiki, xmfwiki.
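For illustration only, the guess in the last question can be written down as a small heuristic. This is a sketch, not a reliable map: the function name is made up, and both the "wiki" suffix rule and the underscore-to-hyphen replacement are assumptions rather than documented behavior. Ids that are not language editions (for example a shared media repository) would still pass through it, which is exactly why an official mapping would be preferable.

```python
from typing import Optional


def guess_language_code(site_id: str) -> Optional[str]:
    """Guess a language code from a site id such as "dewiki".

    ASSUMPTION: the id is "<language prefix>" + "wiki". This is only a
    heuristic; it is not an authoritative mapping.
    """
    suffix = "wiki"
    # Reject ids that do not follow the assumed pattern at all.
    if not site_id.endswith(suffix) or site_id == suffix:
        return None
    prefix = site_id[: -len(suffix)]
    # ASSUMPTION: underscores in the id stand for hyphens in the language code.
    return prefix.replace("_", "-")
```

Prefixes of any length are handled the same way, so 3-letter ids such as gotwiki and xmfwiki need no special case under this assumption.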
Thanks
Jiang,
* Wikidata API has not been finalized, and will be rewritten (with some time overlap to allow for migration). See the proposed Wikidata API roadmap https://www.mediawiki.org/wiki/Requests_for_comment/Wikidata_API, and feel free to make suggestions now before implementation begins. Hence - there isn't that much in terms of documentation yet.
* The wikidata extension currently does not extend the action=query, but instead adds its own. Making Wikidata API integrated into the action=query is my prime intention as part of that roadmap. In the future you will be able to do generator=recentchanges&props=wblanglinks to get the list of langlinks from all the recently changed items in one API call.
* At this point all queries like list=recentchanges (http://www.wikidata.org/w/api.php?action=query&list=recentchanges&rcnamespace=0&rcprop=comment&rclimit=max&rcnamespace=0) and prop=revisions (http://www.wikidata.org/w/api.php?action=query&titles=Q541001&prop=revisions&rvlimit=max) already list all langlink changes to the items, but only by analyzing the comment or the actual change can you check if it was a langlink or a claim change (by using one of the wikibase actions)
* Overall API is moving away from the multi-formats like xml+wddx+php in favor of a single one - json, so in case you are still using xml I suggest migrating. See the main mediawiki API roadmap (https://www.mediawiki.org/wiki/Requests_for_comment/API_Future) for other proposed api changes.
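As a rough sketch of the two queries mentioned above, the URLs can be assembled from their parameters like this. The parameter names are the ones in the example URLs, plus format=json; the function names are made up for illustration. The comment check is an assumption: it supposes that the edit summary of a sitelink change contains the word "sitelink", which this thread does not confirm, so inspect real summaries before relying on it.

```python
from urllib.parse import urlencode

API_ENDPOINT = "http://www.wikidata.org/w/api.php"


def recentchanges_url(limit="max", namespace=0):
    """Build a list=recentchanges query URL for the item namespace."""
    params = {
        "action": "query",
        "list": "recentchanges",
        "rcnamespace": namespace,
        "rcprop": "comment",
        "rclimit": limit,
        "format": "json",
    }
    return API_ENDPOINT + "?" + urlencode(params)


def revisions_url(title, limit="max"):
    """Build a prop=revisions query URL for one item, e.g. "Q541001"."""
    params = {
        "action": "query",
        "titles": title,
        "prop": "revisions",
        "rvlimit": limit,
        "format": "json",
    }
    return API_ENDPOINT + "?" + urlencode(params)


def looks_like_sitelink_change(comment):
    """Guess from an edit summary whether a change touched a sitelink.

    ASSUMPTION: summaries of sitelink edits mention "sitelink" somewhere.
    This is a hypothetical rule, not a documented format.
    """
    return "sitelink" in comment.lower()
```

A mirror could poll recentchanges_url(), keep the entries whose comment passes the check, and fetch the actual change for the rest when the summary is inconclusive.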
-- Yuri Astrakhan (yurik)
P.S. gmail automatically puts your emails into spam because it thinks it's a phishing attack (the mail server forwards it as if it came from you)
Thanks for your reply, Yuri. Comments inline.
On Mon, Mar 4, 2013 at 9:09 PM, Yuri Astrakhan yuriastrakhan@gmail.com wrote:
Jiang,
- Wikidata API has not been finalized, and will be rewritten (with some
time overlap to allow for migration). See the proposed Wikidata API roadmap https://www.mediawiki.org/wiki/Requests_for_comment/Wikidata_API, and feel free to make suggestions
By "make suggestions", do you mean I can click "edit" on a section and type in any text directly? Or is there some dedicated page/forum/mailing list for input? (Sorry for my ignorance, I'm quite new to MediaWiki development.)
now before implementation begins. Hence - there isn't that much in terms of documentation yet.
- The wikidata extension currently does not extend the action=query, but
instead adds its own. Making Wikidata API integrated into the action=query is my prime intention as part of that roadmap. In the future you will be able to do generator=recentchanges&props=wblanglinks to get the list of langlinks from all the recently changed items in one API call.
- At this point all queries like list=recentchanges (http://www.wikidata.org/w/api.php?action=query&list=recentchanges&rcnamespace=0&rcprop=comment&rclimit=max&rcnamespace=0) and
prop=revisions (http://www.wikidata.org/w/api.php?action=query&titles=Q541001&prop=revisions&rvlimit=max) already list all langlink changes to the items, but only by analyzing the comment or the actual change can you check if it was a langlink or a claim change (by using one of the wikibase actions)
I don't understand what a "claim change" is. Could you explain a little more or point me to some docs? "wikibase" is an alias of "wikidata.org", right?
- Overall API is moving away from the multi-formats like xml+wddx+php in
favor of a single one - json, so in case you are still using xml I suggest migrating. See the main mediawiki API roadmap (https://www.mediawiki.org/wiki/Requests_for_comment/API_Future) for other proposed api changes.
Thanks for pointing me to the overall roadmap. I plan to migrate soon.
-- Yuri Astrakhan (yurik)
P.S. gmail automatically puts your emails into spam because it thinks it's a phishing attack (the mail server forwards it as if it came from you)
Is there anything in my email settings I can change to avoid being flagged as spam?
Mediawiki-api mailing list Mediawiki-api@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/mediawiki-api
Hey,
"wikibase" is an alias of "wikidata.org", right?
Wikibase is the name of the software. Wikidata is the name of the project for which this software is developed. Same difference as between MediaWiki and Wikipedia.
* At this point all queries like list=recentchanges (http://www.wikidata.org/w/api.php?action=query&list=recentchanges&rcnamespace=0&rcprop=comment&rclimit=max&rcnamespace=0) and
prop=revisions (http://www.wikidata.org/w/api.php?action=query&titles=Q541001&prop=revisions&rvlimit=max) already list all langlink changes to the items, but only by analyzing the comment or the actual change can you check if it was a langlink or a claim change (by using one of the wikibase actions)
I don't understand what a "claim change" is. Could you explain a little more or point me to some docs?
Entities on a Wikibase repository contain more than just sitelinks. They all have a list of zero or more claims [0], plus a set of labels, descriptions and aliases. Items, the first type of Entity, contain sitelinks as well. Properties, the second type of Entity, contain a data type, which can also be changed. And Queries, yet another type of Entity, contain yet more stuff. So if you only care about sitelink changes, you want to ignore changes to all the other data. This ought to be pretty straightforward though.
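To make "ignore changes to all the other data" concrete, here is a minimal sketch that compares only the sitelinks of two versions of an item. It assumes each version has already been parsed into a dict, and that the sitelinks sit under a single key as a mapping from site id to page title. The key name "sitelinks" and the function name are hypothetical; the actual JSON layout is not documented in this thread, so adjust the key to whatever the response really uses.

```python
def sitelink_changes(old_entity, new_entity, key="sitelinks"):
    """Return the sitelinks added, removed, or retitled between two versions.

    ASSUMPTION: entity[key] is a dict of site id -> page title. Claims,
    labels, descriptions, and aliases are ignored on purpose.
    """
    old = old_entity.get(key, {})
    new = new_entity.get(key, {})
    added = {site: title for site, title in new.items() if site not in old}
    removed = {site: title for site, title in old.items() if site not in new}
    changed = {
        site: (old[site], new[site])
        for site in old.keys() & new.keys()
        if old[site] != new[site]
    }
    return {"added": added, "removed": removed, "changed": changed}
```

If all three result dicts are empty, the revision touched only other data and a sitelink-only mirror can skip it.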
Is it the case you are just interested in mirroring the sitelinks?
[0] https://meta.wikimedia.org/wiki/Wikidata/Data_model#Statements (a Statement is a Claim)
Cheers
-- Jeroen De Dauw http://www.bn2vs.com Don't panic. Don't be evil. --
- Wikidata API has not been finalized, and will be rewritten (with some
time overlap to allow for migration). See the proposed Wikidata API roadmap https://www.mediawiki.org/wiki/Requests_for_comment/Wikidata_API, and feel free to make suggestions
By "make suggestions", do you mean I can click "edit" on a section and type in any text directly? Or is there some dedicated page/forum/mailing list for input? (Sorry for my ignorance, I'm quite new to MediaWiki development.)
The best place to start is the "talk" page (the tab at the top): post your comments there, and people will reply.
now before implementation begins. Hence - there isn't that much in terms
of documentation yet.
P.S. gmail automatically puts your emails into spam because it thinks it's a phishing attack (the mail server forwards it as if it came from you)
Is there anything in my email settings I can change to avoid being flagged as spam?
I'm afraid there is not much you can do, except maybe using a non-google.com account. I had to mark your email as "non-phishing", which sent a copy of the letter to Google to examine. We will see if they can come up with something :)