Sebastian Graf wrote:
Hello everybody,
I am a worker at the computer science department at the University of
Konstanz in Germany. We are working on a revisioned native XML
database. Wikipedia is therefore the optimal playground when it comes
to huge amounts of data since the XML dump is perfect for our
application.
At the moment I am looking for a new dump for the enwiki which
contains all revisions. I know that this XML has to be really huge,
but that's why we want to use it. Unfortunately I couldn't find any
file called "page-meta-history" on the enwiki download section. Can
you help me with some dump, an idea how to get the data,...?
We're currently redoing the full history backups so that they are
scalable. They were taking months to run and usually not finishing. Thus
the history+revision portion was turned off.
We're hoping to have it back online in the next two months with a new
system in place.
For now you still have all of the current page text, metadata of
revisions, logs and other parts of the dump for enwiki
available at
http://download.wikipedia.org/enwiki/
There are four successful runs of the enwiki dump, starting 20090512 and
ending 20090604 (20090610 didn't get through its logging table).
You just won't find the text+revisions for enwiki. If you really need
text+revisions, then I'd suggest pulling from any one of the other
790+ dumps, as they all include it.
We're very eager to get it back up as a lot of people are really
interested in the data.
--tomasz