On Wed, Jan 19, 2011 at 3:33 AM, Aryeh Gregor <Simetrical+wikilist@gmail.com> wrote:
On Wed, Jan 19, 2011 at 3:59 AM, Anthony <wikimail@inbox.org> wrote:
Why isn't this being used for the dumps?
Well, the relevant code is totally unrelated, so the question is sort of a non sequitur.
No, the question is why the relevant code is totally unrelated. Specifically, I'm talking about the full history dumps.
If you mean "Why don't we have incremental dumps?"
No, that's not the question. The question is why you are uncompressing and undiffing (from DiffHistoryBlobs), only to recompress (to bz2) and then uncompress and recompress again (to 7z), when you could get roughly the same compression by just extracting the blobs and removing any non-public data. Or, if it's easier, continue to uncompress (gz) and undiff, then rediff and recompress (gz), as that will be much, much faster than compressing in bz2.
You'll also wind up with a full history dump which is *much* easier to work with. Yes, you'll break backward compatibility, but considering that the English full history dump never finishes, even if you just implemented it for that one it'd be better than the present, which is to have nothing.
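To make the "rediff and recompress (in gz)" idea above concrete, here is a minimal sketch in Python. It is not the DiffHistoryBlob format and not the dump code: the delta encoding (line-level copy/insert operations stored as JSON), the synthetic revisions, and every function name are assumptions made up for illustration. It only shows the general shape of storing the first revision in full, each later revision as a diff against its predecessor, and compressing the chain with zlib (the algorithm behind gz).

```python
import bz2
import difflib
import json
import zlib


def make_revisions(n=50):
    """Build synthetic revisions (lists of lines); each changes one line."""
    lines = [f"Line {i} of the article text.\n" for i in range(40)]
    revisions = []
    for r in range(n):
        lines = list(lines)
        lines[r % len(lines)] = f"Line edited in revision {r}.\n"
        revisions.append(lines)
    return revisions


def make_delta(old, new):
    """Encode `new` as copy/insert operations against `old` (hypothetical format)."""
    ops = []
    matcher = difflib.SequenceMatcher(a=old, b=new, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            ops.append(["copy", i1, i2])
        elif tag in ("replace", "insert"):
            ops.append(["insert", new[j1:j2]])
        # "delete" needs no operation: the old span is simply not copied.
    return ops


def apply_delta(old, ops):
    """Rebuild a revision from its predecessor and a list of operations."""
    new = []
    for op in ops:
        if op[0] == "copy":
            new.extend(old[op[1]:op[2]])
        else:
            new.extend(op[1])
    return new


def pack(revisions):
    """Store the first revision in full, the rest as deltas, then zlib-compress."""
    chain = [revisions[0]]
    for old, new in zip(revisions, revisions[1:]):
        chain.append(make_delta(old, new))
    return zlib.compress(json.dumps(chain).encode("utf-8"))


def unpack(blob):
    """Inverse of pack(): decompress and replay the delta chain."""
    chain = json.loads(zlib.decompress(blob).decode("utf-8"))
    revisions = [chain[0]]
    for ops in chain[1:]:
        revisions.append(apply_delta(revisions[-1], ops))
    return revisions


if __name__ == "__main__":
    revs = make_revisions()
    raw = "".join("".join(r) for r in revs).encode("utf-8")
    print("raw full text bytes:     ", len(raw))
    print("diff chain + zlib bytes: ", len(pack(revs)))
    print("full text + bz2 bytes:   ", len(bz2.compress(raw)))
```

The sizes it prints come from toy data and say nothing about real dumps; wrapping the `pack` and `bz2.compress` calls in `time.perf_counter()` on real page histories would be the way to check the speed claim.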
I'm assuming the answer is (as usual in software development) that there are higher-priority things to do right now.
And there are lots of lower-priority things that are being done. And lots of dollars sitting on the sidelines doing nothing.