On Nov 18, 2007 1:32 PM, Simetrical Simetrical+wikilist@gmail.com wrote:
> On 11/18/07, Anthony wikimail@inbox.org wrote:
> > Oh, God, it's in python? Nevermind.
> What, you prefer PHP?
At least I can understand PHP. But, it turns out most of it *is* in PHP, mostly these two files:
http://svn.wikimedia.org/viewvc/mediawiki/trunk/phase3/maintenance/dumpBacku...
http://svn.wikimedia.org/viewvc/mediawiki/trunk/phase3/maintenance/dumpTextP...
I read through dumpBackup.php, which seems pretty straightforward: it just does a "SELECT /*! STRAIGHT_JOIN */ * FROM page, revision, text WHERE page_id=rev_page ORDER BY page_id" and puts the result into stub-meta-history. dumpTextPass then goes through stub-meta-history and fills in the actual text, but I haven't read that yet.
For the immediate future, a way to restart a broken dump is probably the most important thing. Find the last ~900K segment of the bz2 file, remove it, add the bzip2 end-of-file information, then concatenate the rest of the dump? Sound reasonable?
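
To make the restart idea concrete, here is a rough sketch in Python. It does not do the bit-level surgery described above (bzip2 blocks are not byte-aligned, so cutting a block out by hand is fiddly). Instead it swaps in a simpler technique: decompress whatever complete blocks the truncated file still contains, recompress them as a proper, finished bz2 stream, and rely on the fact that concatenated bz2 streams are still a valid bz2 file. The function name salvage_truncated_bz2 and the chunk size are my own inventions for illustration, not anything from the dump scripts.

```python
import bz2


def salvage_truncated_bz2(src, dst, chunk_size=1 << 20):
    """Copy the decodable prefix of a truncated bzip2 stream from src to dst.

    src and dst are binary file-like objects. The salvaged data is
    recompressed, so dst ends up holding a complete bz2 stream.
    Returns the number of uncompressed bytes salvaged.
    """
    decomp = bz2.BZ2Decompressor()
    comp = bz2.BZ2Compressor()
    salvaged = 0
    while True:
        chunk = src.read(chunk_size)
        if not chunk:
            break  # hit the truncation point (or the real end of the file)
        try:
            # Only data from fully read blocks comes out here; a trailing
            # partial block simply produces no output.
            data = decomp.decompress(chunk)
        except OSError:
            break  # corrupt data: keep what was salvaged so far
        salvaged += len(data)
        dst.write(comp.compress(data))
        if decomp.eof:
            break  # the stream was complete after all
    dst.write(comp.flush())  # writes the proper end-of-stream footer
    return salvaged
```

The returned byte count says where the salvaged data stops, so a restarted dump would pick up from that offset (in practice, from the last complete page before it), compress its output as a separate bz2 stream, and append it to the salvaged file. The cost is decompressing and recompressing the whole partial dump, which is slow for a file this size but avoids touching the bzip2 format by hand.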