In that RFC I plan to unify the Wikidata API so that it integrates more
seamlessly with the core query API. Any feedback from the pywiki community
is welcome.
On Sat, Mar 2, 2013 at 10:49 AM, Maarten Dammers <maarten(a)mdammers.nl> wrote:
Hi everyone,
As you might know, phase 1 of Wikidata (interwiki links) is live on a lot
of Wikipedias and will soon be turned on for all Wikipedias. Phase 2 is
next, that's basically about infobox data. We are going to need a lot of
clever bots to fill Wikidata. To make that possible Pywikipedia should
(properly) implement Wikidata. That way bot authors don't have to worry or
care about the inner workings of the Wikidata API; they just talk to the
framework. At the moment trunk has a first implementation that isn't very
clean and in the rewrite it's still missing.
Legoktm and I talked about this on IRC. We need to have a proper data
model in Pywikipedia. Based on
https://meta.wikimedia.org/wiki/Wikidata/Notes/Data_model_primer:
* WikibasePage is a subclass of Page and has some basic shared functions
for labels, descriptions and aliases
* ItemPage is a subclass of WikibasePage with some item specific functions
like claims and sitelinks (example
https://www.wikidata.org/wiki/Q256638)
* PropertyPage is a subclass of WikibasePage with some property specific
functions for the datatype (example
https://www.wikidata.org/wiki/Property:P22)
* QueryPage is a subclass of WikibasePage for the future query type
* Claim is a subclass of object for claims. Simplified: it's a property
(P22, father) attached to an item (Q256638, the princess) linking to another
item (Q380949, Willem IV)
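
A rough sketch of how this hierarchy could look in code. Page is stubbed
here so the example runs on its own, and all attribute names are only
illustrative, not the framework's actual API:

```python
class Page(object):
    """Stand-in for the framework's Page class (stubbed for this sketch)."""

    def __init__(self, site, title):
        self.site = site
        self.title = title


class WikibasePage(Page):
    """Shared parts: labels, descriptions and aliases."""

    def __init__(self, site, title):
        super().__init__(site, title)
        self.labels = {}        # language code -> label
        self.descriptions = {}  # language code -> description
        self.aliases = {}       # language code -> list of aliases


class ItemPage(WikibasePage):
    """Item specific parts: claims and sitelinks."""

    def __init__(self, site, title):
        super().__init__(site, title)
        self.claims = []     # list of Claim objects
        self.sitelinks = {}  # site dbname -> page title


class PropertyPage(WikibasePage):
    """Property specific parts: the datatype."""

    def __init__(self, site, title, datatype=None):
        super().__init__(site, title)
        self.datatype = datatype


class QueryPage(WikibasePage):
    """Placeholder for the future query type."""


class Claim(object):
    """A property attached to an item, linking to a target."""

    def __init__(self, prop, target):
        self.prop = prop      # e.g. "P22" (father)
        self.target = target  # e.g. "Q380949" (Willem IV)


# The example from the list above: Q256638 has father (P22) Q380949.
princess = ItemPage("wikidata", "Q256638")
princess.claims.append(Claim("P22", "Q380949"))
```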
You can get these pages like a normal page (site object + title), but you
probably also want to get them based on a Wikipedia page. For that there is
https://www.wikidata.org/wiki/Special:ItemByTitle/enwiki/Princess%20Carolina%20of%20Orange-Nassau.
We should have a staticmethod itemByPage(Page). If Page is
https://en.wikipedia.org/wiki/Princess_Carolina_of_Orange-Nassau,
it will give you the ItemPage object for
https://www.wikidata.org/wiki/Q256638.
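
To make the lookup concrete, here is a minimal sketch of the two ways
itemByPage could find the item: the Special:ItemByTitle URL mentioned above,
or wbgetentities called with its sites/titles parameters instead of ids. The
helper names are made up for this example and nothing is sent over the
network:

```python
from urllib.parse import quote


def item_by_title_url(dbname, title):
    """Build the Special:ItemByTitle URL for a page on a client wiki."""
    return "https://www.wikidata.org/wiki/Special:ItemByTitle/%s/%s" % (
        dbname, quote(title))


def item_by_title_params(dbname, title):
    """API parameters asking wbgetentities for the item linked to a page."""
    return {
        "action": "wbgetentities",
        "sites": dbname,   # e.g. "enwiki"
        "titles": title,   # page title on that wiki
        "format": "json",
    }


url = item_by_title_url("enwiki", "Princess Carolina of Orange-Nassau")
params = item_by_title_params("enwiki", "Princess Carolina of Orange-Nassau")
```

itemByPage would then only need the site's database name and the page title
from the Page object, and would return an ItemPage for whatever id comes
back.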
Currently in trunk the DataPage object has a constructor where you can give
a page object and you'll get the corresponding DataPage. I don't think
that's the way to do it because it violates the data model and will get us
in a lot of trouble later on when other sites (like Commons) might
implement the Wikibase extension.
A WikibasePage should work the same as a normal page when it comes to
fetching data. It should have the initial version (just a title, no
content) and once you use a function that needs data (or you force it), it
will fetch all the data from Wikibase and cache it.
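
Something like the following is what I have in mind for the fetch-on-demand
behaviour. It is only a sketch: the class name is hypothetical and the fetch
function is passed in (and faked here) so the example runs without touching
the API:

```python
class LazyWikibasePage(object):
    """Holds just a title until some function actually needs the data."""

    def __init__(self, title, fetch):
        self.title = title
        self._fetch = fetch    # callable: title -> dict with the entity data
        self._content = None   # nothing loaded yet

    def get(self, force=False):
        """Return the cached data, fetching it first if needed or forced."""
        if force or self._content is None:
            self._content = self._fetch(self.title)
        return self._content

    @property
    def labels(self):
        # Any accessor that needs data goes through get().
        return self.get().get("labels", {})


calls = []  # records every (fake) request so the caching is visible


def fake_fetch(title):
    calls.append(title)
    return {"labels": {"en": "example label"}}


page = LazyWikibasePage("Q256638", fake_fetch)
fetches_after_init = len(calls)   # constructing the page fetches nothing
first = page.labels               # triggers the first fetch
second = page.labels              # served from the cache
fetches_after_reads = len(calls)
page.get(force=True)              # forcing fetches again
```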
* For an item the data looks like
https://www.wikidata.org/w/api.php?action=wbgetentities&ids=Q256638&format=json
* For a property the data looks like
https://www.wikidata.org/w/api.php?action=wbgetentities&ids=P22&format=json
Parts of the data (description, aliases and labels) should be processed in
the get function of WikibasePage, other parts in ItemPage/PropertyPage.
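
A sketch of how that split could work. The SAMPLE dict is hand-written to
mimic the shape I believe wbgetentities returns (the values are made up, and
the real response has more keys), and the function names are hypothetical:

```python
# Hand-written sample in the assumed shape of a wbgetentities response.
SAMPLE = {
    "entities": {
        "Q256638": {
            "labels": {"en": {"language": "en", "value": "sample label"}},
            "descriptions": {"en": {"language": "en", "value": "sample description"}},
            "aliases": {"en": [{"language": "en", "value": "sample alias"}]},
            "sitelinks": {
                "enwiki": {"site": "enwiki",
                           "title": "Princess Carolina of Orange-Nassau"},
            },
        }
    }
}


def parse_shared(entity):
    """The part WikibasePage.get would handle for every entity type."""
    return {
        "labels": {lang: v["value"]
                   for lang, v in entity.get("labels", {}).items()},
        "descriptions": {lang: v["value"]
                         for lang, v in entity.get("descriptions", {}).items()},
        "aliases": {lang: [a["value"] for a in vals]
                    for lang, vals in entity.get("aliases", {}).items()},
    }


def parse_item(entity):
    """The extra part ItemPage would handle on top of the shared data."""
    data = parse_shared(entity)
    data["sitelinks"] = {site: v["title"]
                         for site, v in entity.get("sitelinks", {}).items()}
    return data


item = parse_item(SAMPLE["entities"]["Q256638"])
```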
Based on the API we should probably have some generators:
* One or more generators that use wbgetentities to (pre-)fetch objects
* A search generator that uses wbsearchentities
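
For the pre-fetching generator, the core is probably just batching ids into
wbgetentities requests. A minimal sketch that only builds the request
parameters: the function names are invented, and the batch size of 50 is an
assumption about the API limit that should be checked:

```python
def batched(ids, size=50):
    """Yield successive lists of at most `size` ids."""
    ids = list(ids)
    for start in range(0, len(ids), size):
        yield ids[start:start + size]


def entity_requests(ids, size=50):
    """Yield one wbgetentities parameter dict per batch of ids."""
    for chunk in batched(ids, size):
        yield {
            "action": "wbgetentities",
            "ids": "|".join(chunk),  # multiple values are pipe-separated
            "format": "json",
        }


def search_request(text, language="en"):
    """Parameters for a wbsearchentities query."""
    return {
        "action": "wbsearchentities",
        "search": text,
        "language": language,
        "format": "json",
    }


requests = list(entity_requests(["Q256638", "Q380949", "P22"], size=2))
```

The real generators would send each parameter dict through the framework's
request machinery and yield ItemPage/PropertyPage objects with their data
already cached.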
WikibasePage:
* Set/add/delete label (@property?)
* Set/add/delete description (@property?)
* Set/add/delete alias (@property?)
ItemPage
* Set/add/delete sitelink (@property?)
Claim logic
Not sure how we can use wbeditentity and wblinktitles
We took some notes on
https://www.mediawiki.org/wiki/Manual:Pywikipediabot/Wikidata/Rewrite_proposal.
What do you think? Is this the right direction? Feedback is appreciated.
Maarten
_______________________________________________
Pywikipedia-l mailing list
Pywikipedia-l(a)lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/pywikipedia-l