Hello
I think Apache CouchDB would be a great fit to address the issue of keeping up to date with the whole database. Quoting the Wikipedia article:

Main features
[...]
Distributed Architecture with Replication
CouchDB was designed with bi-directional replication (or synchronization) and off-line operation in mind. That means multiple replicas can have their own copies of the same data, modify it, and then sync those changes at a later time.


Wikimedia could run a CouchDB instance updated live or, if that is not possible, at the same regularity as the dumps. People interested could then either run their own instance live-mirroring the Wikimedia master instance (using replication), or simply make a request from time to time to learn which entities changed (using the _changes endpoint).
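
To give an idea, polling the _changes feed could look something like the sketch below (Python; the instance URL and database name are hypothetical, just for illustration):

    import requests

    # Hypothetical Wikimedia CouchDB instance and database name
    BASE = "https://wikidata-couch.example.org/wikidata"

    # Ask for everything that changed since the last sequence id we saw;
    # "since=0" replays the whole history, so persist last_seq between runs.
    resp = requests.get(BASE + "/_changes", params={"since": 0, "limit": 100})
    feed = resp.json()

    for change in feed["results"]:
        # Each entry gives the document (entity) id and its new revision(s)
        print(change["id"], change["changes"])

    # Remember where we stopped: pass this as "since" on the next poll
    last_seq = feed["last_seq"]

Consumers would only have to store last_seq between polls to get incremental updates.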

I guess the first replication will take more time and be more resource-intensive than a simple file dump, but that initial cost would quickly be compensated by the following differential updates.
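
Setting up such a mirror could be as simple as one request to the local CouchDB's _replicate endpoint (again a sketch, reusing the same hypothetical source URL):

    import requests

    # Start a continuous pull replication from the (hypothetical) master
    # instance into a local database named "wikidata"
    requests.post(
        "http://localhost:5984/_replicate",
        json={
            "source": "https://wikidata-couch.example.org/wikidata",
            "target": "wikidata",
            "create_target": True,   # create the local db if it does not exist
            "continuous": True,      # keep pulling changes as they happen
        },
    )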

This would be beautiful :)

And I already have a use case for my Wikidata Subset Search Engine \o/

Let me know if I can help make it happen.

Best,

Maxime

--
Maxime Lathuilière
maxlath.eu - twitter
inventaire.io - roadmap - code - twitter - facebook
wiki(pedia|data): Zorglub27
for personal emails use max@maxlath.eu instead

On 27/08/2016 at 21:20, Stas Malyshev wrote:
Hi!

I know it's been mentioned on this list before, but it would be
incredibly useful to have incremental dumps of Wikidata, as downloading
the current dumps can now take several hours over a poor-bandwidth
Internet connection.
See also: https://phabricator.wikimedia.org/T85101