Hey,
Are there any ideas for some kind of incremental dump? Or are bandwidth and disk storage not enough of a problem to justify the more complex implementation of incremental dumps? And are there any statistics about how much bandwidth is used by downloads and database dumps (i.e. not normal visitor traffic)?
Incremental dumps have always been a constant rave among new developers. Maybe it would not be too sophisticated to provide streams of changed articles, but for now, proper sync is possible only by replicating SQL commands.
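[To make the sync problem concrete, here is a minimal Python sketch, with all names hypothetical rather than actual MediaWiki code, of applying a stream of changed-article events to a local mirror. The point: unless deletions and moves arrive as explicit events alongside edits, the mirror silently drifts, which is why a plain stream of changed article texts falls short of proper sync.]

    # Minimal sketch: syncing a local title -> text mirror from a stream
    # of change events. Hypothetical types and names, for illustration only.
    from dataclasses import dataclass
    from typing import Dict

    @dataclass
    class ChangeEvent:
        kind: str            # "edit", "delete" or "move"
        title: str
        text: str = ""       # used by "edit"
        new_title: str = ""  # used by "move"

    def apply_event(mirror: Dict[str, str], ev: ChangeEvent) -> None:
        """Apply one change event to the local mirror."""
        if ev.kind == "edit":
            mirror[ev.title] = ev.text
        elif ev.kind == "delete":
            mirror.pop(ev.title, None)          # missed if the stream omits deletes
        elif ev.kind == "move":
            mirror[ev.new_title] = mirror.pop(ev.title, "")

    if __name__ == "__main__":
        mirror: Dict[str, str] = {"Foo": "old text"}
        stream = [
            ChangeEvent("edit", "Foo", text="new text"),
            ChangeEvent("move", "Foo", new_title="Bar"),
            ChangeEvent("delete", "Bar"),
        ]
        for ev in stream:
            apply_event(mirror, ev)
        print(mirror)  # {} -- correct only because deletes/moves were explicit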
We currently do have incremental dumps of a sort (eh, MySQL binlogs), but those, unlike the public dump service, contain internal tables (with all the sensitive information). To provide those to the public, we would either have to set up yet another MySQL slave with limited replication, or somehow filter the binlogs, which is a PITA as well.
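[For illustration, a rough Python sketch of the binlog-filtering route, assuming statement-based binlog text (as printed by mysqlbinlog) and a hypothetical whitelist of public tables. The naive regex is exactly where the pain lives: reliably extracting table names from arbitrary SQL is hard, which is much of why filtering binlogs is a PITA. The other route, a slave with limited replication, would lean on MySQL's replicate-do-table / replicate-ignore-table options instead.]

    # Sketch: pass through only binlog statements that clearly touch a
    # whitelisted public table. Table names and regex are illustrative.
    import re
    import sys

    PUBLIC_TABLES = {"cur", "old", "links", "imagelinks"}  # hypothetical whitelist

    # Naive: grabs the table name after INSERT INTO / UPDATE / DELETE FROM.
    STMT_RE = re.compile(
        r"^\s*(?:INSERT\s+INTO|UPDATE|DELETE\s+FROM)\s+`?(\w+)`?",
        re.IGNORECASE,
    )

    def is_public(statement: str) -> bool:
        """Keep a statement only if it clearly targets a whitelisted table."""
        m = STMT_RE.match(statement)
        return bool(m) and m.group(1).lower() in PUBLIC_TABLES

    if __name__ == "__main__":
        for line in sys.stdin:     # e.g. piped from `mysqlbinlog`
            if is_public(line):
                sys.stdout.write(line)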
There's yet another issue: with 1.5, a major database redesign will happen, which will force us to provide non-SQL dumps. We're not even sure whether revision texts will be kept in MySQL in the future.
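[Purely as a hypothetical illustration of what a non-SQL dump record could look like, independent of whichever engine ends up storing the text, here is a short Python sketch; the element names are made up and are not necessarily the format 1.5 will actually ship.]

    # Sketch: serialize one revision as a self-describing XML record.
    # Element names are invented for illustration.
    import sys
    import xml.etree.ElementTree as ET

    def revision_element(title: str, rev_id: int, text: str) -> ET.Element:
        """Build one page/revision record."""
        page = ET.Element("page")
        ET.SubElement(page, "title").text = title
        rev = ET.SubElement(page, "revision")
        ET.SubElement(rev, "id").text = str(rev_id)
        ET.SubElement(rev, "text").text = text
        return page

    if __name__ == "__main__":
        elem = revision_element("Example", 42, "article wikitext here")
        ET.ElementTree(elem).write(sys.stdout, encoding="unicode")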
Cheers, Domas