Is it easy to briefly describe the added value (or supported use cases) of switching
The edit stream in Wikidata is so huge that I can hardly think of anyone
wanting to be in *real-time* sync with Wikidata.
At 20 requests per second, their infrastructure would have to be pretty scalable not to break.
Maybe I am biased by DBpedia, but in some experiments on English
Wikipedia we found that the ideal update interval with OAI-PMH was every ~5 minutes.
OAI aggregates multiple revisions of a page into a single edit,
so when we ask "get me the items that changed in the last 5 minutes", we skip
the processing of many minor edits.
It looks like we lose this option with PubSubHubbub, right?
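
To illustrate what I mean by aggregation, here is a minimal sketch. It is not the OAI-PMH extension's actual code or API; `Revision`, `changed_pages` and the sample data are hypothetical names made up for this example. It only shows how polling a time window collapses several revisions of the same item into one unit of work.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Revision:
    page: str       # item or page identifier (hypothetical sample data)
    rev_id: int     # increasing revision id
    timestamp: int  # seconds, relative to an arbitrary start


def changed_pages(revisions, since, until):
    """Return only the latest revision per page inside (since, until]."""
    latest = {}
    for rev in revisions:
        if since < rev.timestamp <= until:
            current = latest.get(rev.page)
            if current is None or rev.rev_id > current.rev_id:
                latest[rev.page] = rev
    return latest


# Three edits to Q42 and one to Q64 inside a single 5-minute window.
edits = [
    Revision("Q42", 101, 10),
    Revision("Q42", 102, 70),
    Revision("Q42", 103, 200),
    Revision("Q64", 104, 250),
]
window = changed_pages(edits, since=0, until=300)
print(len(edits), "revisions ->", len(window), "items to process")
```

With a per-edit push, each of the four revisions would trigger its own notification; with a windowed poll, only the last state of each item has to be processed.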
As we already asked before, does PubSubHubbub support mirroring a Wikidata
clone? The OAI-PMH extension has this option.
On Tue, Jul 8, 2014 at 11:31 AM, Daniel Kinzler <daniel.kinzler(a)wikimedia.de
Replying to myself because I forgot to mention an
On 08.07.2014 at 10:22, Daniel Kinzler wrote:
On 08.07.2014 at 01:46, Rob Lanphier wrote:
> On Fri, Jul 4, 2014 at 7:16 AM, Lydia Pintscher <
> Hi Lydia,
> Thanks for providing the basic overview of this. Could you (or someone
> team) provide an explanation about how you would like this to be
We'd like to enable it just on Wikidata at first, but I see no reason not to
enable it for all projects if that goes well.
The PubSubHubbub (PuSH) extension would be configured to push
the Google hub (two per edit). The hub then notifies any subscribers via
We need a proxy to be set up to allow the app servers to talk to the
If this is deployed at full scale, we expect in excess of 20 POST requests per
second (two per edit), plus up to the same number (but probably fewer) of
requests coming back from the hub, asking for the full page content of each
changed page, as an XML export, from a special page interface similar to
Special:Export. This would probably bypass the web cache.
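
As a rough back-of-the-envelope sketch of those numbers: 20 POST requests per second at two per edit corresponds to about 10 edits per second. The function below is only an illustration; `push_load` and `fetch_ratio` are made-up names, and the assumption that the hub fetches at most once per notification is taken from "up to the same number" above.

```python
def push_load(edits_per_second, pings_per_edit=2, fetch_ratio=1.0):
    """Estimate request rates for a per-edit push setup.

    fetch_ratio is an assumed fraction of notifications that lead the hub
    to request the full page export (1.0 is the stated upper bound).
    """
    outbound = edits_per_second * pings_per_edit  # POSTs to the hub
    inbound = outbound * fetch_ratio              # export requests from the hub
    return outbound, inbound


outbound, inbound = push_load(10)
print(f"{outbound} POST/s out, up to {inbound:.0f} export requests/s in")
```

Lowering `fetch_ratio` models the "probably fewer" case, where the hub does not fetch the export for every notification.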
PubSubHubbub is nice and simple, but it's really designed for news feeds, not
for versioned content of massive collaborative sites. It works, but it's not as
efficient as we could wish.
Senior Software Developer
Gesellschaft zur Förderung Freien Wissens e.V.
Wikidata-tech mailing list