On Wed, Mar 4, 2009 at 5:05 AM, Robert Ullmann <rlullmann(a)gmail.com> wrote:
> For every 1000th pageid, get the earliest rev and note the date and
> time. (If a given ID is missing, i.e. deleted, hunt around it, +1, -1,
> +2, -2 etc. until you find one.)
>
> To get the date and time for a particular pageid, interpolate between
> the next higher and lower 1000th. This should pretty much always get
> you the correct date, with some chance of it being off by one for
> pages created near midnight UTC.

That's a clever idea. As it turns out, using the stub dump wasn't bad;
I was able to assign earliest revision dates and revision counts to
the articles in WEX overnight.
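
For anyone who wants to try the sampling approach instead, here is a rough
sketch of the two steps Robert describes: hunting around a missing ID, and
interpolating between the nearest sampled IDs. The function names, the
`exists` callback, the `max_offset` cutoff, and the `samples` list of
(pageid, datetime) pairs are all made up for illustration; fetching the
earliest revision for each sampled ID is left out, and the sketch assumes
page IDs were handed out in roughly chronological order.

```python
from bisect import bisect_left
from datetime import datetime, timezone


def nearest_existing(pageid, exists, max_offset=100):
    """Try pageid, then +1, -1, +2, -2, ... until exists() accepts one.

    `exists` is a hypothetical callback returning True for live page IDs.
    Returns None if nothing is found within max_offset.
    """
    if exists(pageid):
        return pageid
    for offset in range(1, max_offset + 1):
        for candidate in (pageid + offset, pageid - offset):
            if candidate > 0 and exists(candidate):
                return candidate
    return None


def estimate_creation(pageid, samples):
    """Linearly interpolate a creation time from sampled anchor points.

    `samples` is a list of (pageid, datetime) pairs sorted by pageid,
    e.g. one entry near every 1000th ID.
    """
    ids = [pid for pid, _ in samples]
    i = bisect_left(ids, pageid)
    if i < len(ids) and ids[i] == pageid:
        return samples[i][1]  # exact hit on a sampled ID
    if i == 0 or i == len(ids):
        raise ValueError("pageid lies outside the sampled range")
    (lo_id, lo_ts), (hi_id, hi_ts) = samples[i - 1], samples[i]
    fraction = (pageid - lo_id) / (hi_id - lo_id)
    return lo_ts + (hi_ts - lo_ts) * fraction


# Made-up anchor points, purely to show the call shape.
samples = [
    (1000, datetime(2009, 1, 1, tzinfo=timezone.utc)),
    (2000, datetime(2009, 1, 3, tzinfo=timezone.utc)),
]
print(estimate_creation(1500, samples).date())
```

Taking `.date()` of the result gives the estimated creation day, which is
where the off-by-one near midnight UTC can creep in.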