Antoine Amarilli wrote:
Hi all,
I was wondering if there are any plans to provide incremental dumps (i.e., the diff between each dump and the previous one) at download.wikimedia.org. It seems to me that such diffs would help save bandwidth, because mirrors could stay up to date by downloading the diffs and applying them, rather than downloading the whole dump each time.
I am new here, so I hope that this mailing list is the correct place for this kind of suggestion. If I am wrong, please tell me.
Regards,
This mailing list is not bad, but you have a more specific one at https://lists.wikimedia.org/mailman/listinfo/xmldatadumps-l
The idea is not bad, but we would probably need to roll out a custom diff script to ensure it keeps memory usage low (that's not necessarily hard). But if we were going to do that, we could just as well provide the differences in XML format instead, for easy applying. The result of applying a diff XML to an existing wiki is straightforward for new and changed pages, but what to do with removed pages?
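To make the discussion a bit more concrete, here is a rough, untested sketch (in Python, names purely illustrative) of what such a low-memory diff pass could look like for a pages-meta-current dump, assuming <page> elements appear in ascending page-id order as they do in the current dumps. A streaming parse with a merge-style join keeps only one page from each dump in memory at a time, and falls naturally out of it which pages were removed:

```python
# A rough sketch of a streaming page-level diff between two MediaWiki XML
# dumps (pages-meta-current style: one <revision> per <page>).  It assumes
# <page> elements appear in ascending page-id order, so a single
# merge-style pass suffices and memory stays flat per page.
# XML namespaces are omitted for brevity; all names are illustrative only.
import io
import xml.etree.ElementTree as ET

def iter_pages(stream):
    """Yield (page_id, revision_id) pairs from a dump stream."""
    for _event, elem in ET.iterparse(stream, events=("end",)):
        if elem.tag == "page":
            pid = int(elem.findtext("id"))
            rev = int(elem.findtext("revision/id"))
            elem.clear()  # discard the subtree to keep memory usage low
            yield pid, rev

def dump_diff(old_stream, new_stream):
    """Merge-join two dumps; return (changed_or_new_ids, removed_ids)."""
    changed, removed = [], []
    old, new = iter_pages(old_stream), iter_pages(new_stream)
    o, n = next(old, None), next(new, None)
    while o is not None or n is not None:
        if n is None or (o is not None and o[0] < n[0]):
            removed.append(o[0])            # only in old dump: deleted page
            o = next(old, None)
        elif o is None or n[0] < o[0]:
            changed.append(n[0])            # only in new dump: created page
            n = next(new, None)
        else:
            if o[1] != n[1]:                # same page id, newer revision
                changed.append(n[0])
            o, n = next(old, None), next(new, None)
    return changed, removed
```

The changed/new page ids could then be used to copy the full <page> elements into a diff XML that a mirror applies like a normal import; the removed ids are exactly the open question above, since plain XML import has no notion of deletion.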