On 10/20/07, Platonides <Platonides(a)gmail.com> wrote:
> Gregory Maxwell wrote:
> > Bleh. Someone pulling increments couldn't build a point-in-time
> > snapshot; they would need to always pull the full. And we want people
> > using point-in-time versions of the site, not mangled mixes.
> They'd use the stubs version.
Okay, you didn't mention that.... but please no: I have had a hard
enough time explaining to people that the separate SQL dumps aren't
consistent with the history dumps.
I don't want to end up in a situation where the only way to get a sane
copy of the site is stitching together dozens of files on the
recipient's side... people will do it wrong, or just skip building a
point-in-time version at all, and make a big mess.
I'd rather go back to having separate metadata and text dumps than end
up with people needing to combine an old full dump, N large
incremental files, and a new stub dump through a bunch of complex
manipulation in order to arrive at a consistent copy of the site.
If we wanted to do that on the back end.. fine.
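To make the concern concrete, here is a toy sketch of the client-side "stitching" described above: a mirror operator takes the revision ids listed in a new stub dump and has to resolve each one against an old full dump plus a pile of incremental files, in the right order. All names and the dict-based "dump" representation here are illustrative assumptions, not the real dump formats or tools.

```python
def stitch_snapshot(stub_rev_ids, old_full, incrementals):
    """Resolve a point-in-time snapshot on the recipient's side.

    stub_rev_ids: revision ids defining the snapshot (from the stub dump).
    old_full, incrementals: dicts mapping rev_id -> revision text
    (a stand-in for the real dump files).
    """
    texts = dict(old_full)
    for inc in incrementals:      # later increments override earlier ones
        texts.update(inc)

    snapshot, missing = {}, []
    for rid in stub_rev_ids:
        if rid in texts:
            snapshot[rid] = texts[rid]
        else:
            missing.append(rid)   # any gap means no consistent copy
    return snapshot, missing

# One old full, two increments; the stub asks for rev 5 that no file has.
full = {1: "r1", 2: "r2"}
inc1 = {3: "r3"}
inc2 = {4: "r4", 2: "r2-fixed"}
snap, missing = stitch_snapshot([1, 2, 3, 4, 5], full, [inc1, inc2])
print(sorted(snap), missing)   # -> [1, 2, 3, 4] [5]
```

Even in this simplified form, getting a consistent result depends on applying every increment, in order, and on the stub matching the text files exactly; miss one file and you silently get a mangled mix.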
> > Also, I expect that once 7zed the increments will not be too much
> > smaller than the full, especially if partitioned by revid.
> I wasn't proposing a file per revid, but a file per N revisions, where
> N is a number which fits our needs ;-)
Partition by revid doesn't necessarily mean one rev per file... and
that's certainly not what I thought you were suggesting.
You will screw compression if you partition by revid (i.e. in groups
of revs, failing to keep all revs of a single article in one place).
If you don't want to take my word for it, try it yourself.
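The compression effect is easy to demonstrate. Successive revisions of one article are near-duplicates, so a compressor only exploits them if they sit close together in the stream. This toy uses zlib (32 KB window) as a stand-in; the real dumps use bzip2/7z with larger windows, but the same thing happens at dump scale when other pages' revisions push the previous revision of an article out of reach. The revision generator here is made up for illustration.

```python
import random
import string
import zlib

random.seed(42)

def make_revisions(n_revs, size=20000):
    """Simulate revisions of one article: an incompressible random base
    text plus a small append-only edit per revision."""
    base = "".join(random.choices(string.ascii_letters + " ", k=size))
    revs, text = [], base
    for i in range(n_revs):
        text += f" [edit {i}]"
        revs.append(text)
    return revs

article_a = make_revisions(10)
article_b = make_revisions(10)

# Grouped by page: all revisions of A, then all revisions of B.
grouped = "".join(article_a + article_b).encode()

# Partitioned by revid: A rev 0, B rev 0, A rev 1, B rev 1, ...
# Consecutive revisions of the same article are now ~40 KB apart,
# outside zlib's 32 KB window.
interleaved = "".join(
    rev for pair in zip(article_a, article_b) for rev in pair
).encode()

size_grouped = len(zlib.compress(grouped, 9))
size_interleaved = len(zlib.compress(interleaved, 9))
print(f"grouped by page: {size_grouped} bytes, "
      f"interleaved by revid: {size_interleaved} bytes")
```

With revisions grouped by page, every revision after the first compresses to back-references into its predecessor; interleaved by revid, each revision has to be stored almost from scratch, and the output is several times larger.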