Sebastian Graf wrote:
Hello everybody,
I work at the computer science department of the University of Konstanz in Germany. We are working on a revisioned native XML database. Wikipedia is therefore the optimal playground when it comes to huge amounts of data, since the XML dump is perfect for our application.
At the moment I am looking for a new enwiki dump which contains all revisions. I know that this XML has to be really huge, but that's why we want to use it. Unfortunately I couldn't find any file called "page-meta-history" in the enwiki download section. Can you help me with a dump, or with an idea of how to get the data?
We're currently redoing the full history backups so that they are scalable. They were taking months to run and usually not finishing. Thus the history+revision portion was turned off.
We're hoping to have it back online in the next two months with a new system in place.
For now you still have all of the current page text, metadata of revisions, logs and other parts of the dump for enwiki available at
http://download.wikipedia.org/enwiki/
There are four successful enwiki runs, starting with 20090512 and ending with 20090604 (the 20090610 run didn't get through its logging table).
You just won't find the text+revisions for enwiki. If you really have to have text+revisions, then I'd suggest pulling from any one of the other 790+ dumps, as they all include it.
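As a rough illustration of reading one of those full-history dumps without loading it into memory, here is a minimal Python sketch that streams the XML and counts revisions per page. It assumes the usual export layout of page elements that contain a title and one or more revision elements. The function names, the sample document and the namespace URI in it are made up for the example, and the code ignores the namespace entirely.

```python
import io
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip the '{namespace}' prefix that ElementTree puts on tag names."""
    return tag.rsplit("}", 1)[-1]


def count_revisions(stream):
    """Yield (title, revision_count) for each <page>, parsing incrementally."""
    title, revisions = None, 0
    for _event, elem in ET.iterparse(stream, events=("end",)):
        name = local_name(elem.tag)
        if name == "title":
            title = elem.text
        elif name == "revision":
            revisions += 1
            elem.clear()  # drop the revision text once it has been counted
        elif name == "page":
            yield title, revisions
            title, revisions = None, 0
            elem.clear()  # drop the finished page's children


# Tiny made-up sample standing in for a real dump file (illustrative only).
SAMPLE = b"""<mediawiki xmlns="http://www.mediawiki.org/xml/export-0.3/">
  <page>
    <title>Konstanz</title>
    <revision><id>1</id><text>first</text></revision>
    <revision><id>2</id><text>second</text></revision>
  </page>
  <page>
    <title>XML</title>
    <revision><id>3</id><text>only</text></revision>
  </page>
</mediawiki>"""

for page_title, n in count_revisions(io.BytesIO(SAMPLE)):
    print(page_title, n)
```

For a real bzip2-compressed dump, the same function should work on a file object from `bz2.open(path, "rb")`, where `path` is whichever dump file you downloaded.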
We're very eager to get it back up, as a lot of people are really interested in the data.
--tomasz