Hi,
Sorry for another long email today.
Currently, when you change a Wikidata item, its associated Wikipedia articles get told to update, too. So your change to the IMDB ID of a movie in Wikidata will be pushed to all language versions of that article on Wikipedia. Yay!
Two use cases are currently not possible:
* A Wikipedia article on a city might display the mayor. If someone changes the label of the mayor's item on Wikidata, the Wikipedia article picks up the change the next time the page is rendered, but nothing actively triggers an update of the page.
* A Wikipedia article might want to include data about an item other than its associated item. This matters most for references, where I might be interested in the author of a book, its year of publication, etc. This feature is currently disabled (even though it would be trivial to switch it on) because this information would only get updated when the page is actively rerendered.
In order to enable these use cases we need to track on which pages (on Wikipedia) an item (from Wikidata) is used. We are thinking of doing this in two tables:
* EntityUsage: one table per client. It has two columns, one with the pageId and one with the entityId, indexed on both columns (and one column with a pk, I guess, for OSC).
* Subscriptions: one table on the client. It has two columns, one with the pageId and one with the siteId, indexed on both columns (and one column with a pk, I guess, for OSC).
EntityUsage is a potentially big table (something like pagelinks-size).
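A rough sketch of the two tables, using SQLite through Python purely for illustration. All table, column and index names here are assumptions and not an actual schema, and Subscriptions is shown in the form given by the corrections further down this thread (entityId -> siteId, living on the repo).

```python
import sqlite3


def create_client_tables(conn):
    # EntityUsage: one such table per client wiki (hypothetical names).
    conn.execute(
        """CREATE TABLE entity_usage (
               id        INTEGER PRIMARY KEY,  -- surrogate pk
               page_id   INTEGER NOT NULL,     -- page on the client wiki
               entity_id TEXT    NOT NULL      -- e.g. 'Q64'
           )"""
    )
    # Indexed on both columns, as proposed above.
    conn.execute("CREATE INDEX eu_page ON entity_usage (page_id)")
    conn.execute("CREATE INDEX eu_entity ON entity_usage (entity_id)")


def create_repo_tables(conn):
    # Subscriptions: which client sites care about which entity.
    conn.execute(
        """CREATE TABLE subscriptions (
               id        INTEGER PRIMARY KEY,  -- surrogate pk
               entity_id TEXT NOT NULL,
               site_id   TEXT NOT NULL         -- e.g. 'enwiki'
           )"""
    )
    conn.execute("CREATE INDEX sub_entity ON subscriptions (entity_id)")
    conn.execute("CREATE INDEX sub_site ON subscriptions (site_id)")


# Two in-memory databases stand in for one client and the repo.
client = sqlite3.connect(":memory:")
repo = sqlite3.connect(":memory:")
create_client_tables(client)
create_repo_tables(repo)

# Example rows: page 42 on the client uses item Q64, so the client subscribes.
client.execute(
    "INSERT INTO entity_usage (page_id, entity_id) VALUES (?, ?)", (42, "Q64")
)
repo.execute(
    "INSERT INTO subscriptions (entity_id, site_id) VALUES (?, ?)", ("Q64", "enwiki")
)
```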
When an item changes on Wikidata, Wikidata consults the Subscriptions table and dispatches the change to all clients listed there for that item. Each client then receives the change and, based on its EntityUsage table, performs the necessary updates.
We would like your input on this approach: do you see any problems, or improvements that we should make?
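The flow could be sketched roughly like this, with in-memory dictionaries standing in for the two tables. The class and method names (`Repo`, `Client`, `dispatch_change`, `receive_change`) are made up for illustration, the "update" is reduced to collecting page ids, and Subscriptions is again taken as entityId -> siteId on the repo, as per the corrections later in this thread.

```python
from collections import defaultdict


class Client:
    """A client wiki with its own EntityUsage data (hypothetical sketch)."""

    def __init__(self, site_id):
        self.site_id = site_id
        self.entity_usage = defaultdict(set)  # entity_id -> set of page ids
        self.pages_to_update = set()          # stand-in for purge/rerender jobs

    def track_usage(self, page_id, entity_id):
        self.entity_usage[entity_id].add(page_id)

    def receive_change(self, entity_id):
        # Look up which local pages use the changed entity.
        pages = set(self.entity_usage.get(entity_id, ()))
        self.pages_to_update |= pages
        return pages


class Repo:
    """The repo (Wikidata) with the Subscriptions data (hypothetical sketch)."""

    def __init__(self):
        self.subscriptions = defaultdict(set)  # entity_id -> set of site ids
        self.clients = {}                      # site_id -> Client

    def register(self, client):
        self.clients[client.site_id] = client

    def subscribe(self, entity_id, site_id):
        self.subscriptions[entity_id].add(site_id)

    def dispatch_change(self, entity_id):
        # Notify only the clients subscribed to this entity.
        notified = []
        for site_id in sorted(self.subscriptions.get(entity_id, ())):
            self.clients[site_id].receive_change(entity_id)
            notified.append(site_id)
        return notified


repo = Repo()
enwiki = Client("enwiki")
dewiki = Client("dewiki")
repo.register(enwiki)
repo.register(dewiki)

# Page 101 on enwiki uses Q64, so enwiki subscribes to Q64; dewiki does not.
enwiki.track_usage(101, "Q64")
repo.subscribe("Q64", "enwiki")

notified = repo.dispatch_change("Q64")
```

In this sketch only enwiki is contacted for a change to Q64, and it marks only page 101 for an update; dewiki never hears about the change.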
Cheers, Denny
Small correction.
2013/7/22 Denny Vrandečić denny.vrandecic@wikimedia.de
- Subscriptions: one table on the client. It has two columns, one with the
pageId and one with the siteId, indexed on both columns (and one column with a pk, I guess, for OSC).
That's entityId -> siteId, not pageId to siteId.
Another correction, same line. Gosh, it's hot here. Brain not working. Me off home.
2013/7/22 Denny Vrandečić denny.vrandecic@wikimedia.de
2013/7/22 Denny Vrandečić denny.vrandecic@wikimedia.de
- Subscriptions: one table on the client. It has two columns, one with
the pageId and one with the siteId, indexed on both columns (and one column with a pk, I guess, for OSC).
That's entityId -> siteId, not pageId to siteId.
And that's repo. Not client.
wikidata-tech@lists.wikimedia.org