Thanks Joshua. I would prefer that you post to the Mailing List / Newsgroup – so that all can benefit from your ideas.
--- On Sun, 8 Mar 2009, Joshua C. Lerner jlerner@gmail.com wrote:
From: Joshua C. Lerner jlerner@gmail.com Subject: Re: [Wikitech-l] Importing Wikipedia XML Dumps into MediaWiki
Just for kicks I decided to try to do an import of ptwiki - using what I learned in bringing up mirrors of 4 Greek and 3 English Wikimedia sites, including Greek Wikipedia. Basically I had the best luck with Xml2sql (http://meta.wikimedia.org/wiki/Xml2sql). The conversion from XML to SQL went smoothly:
# ./xml2sql /mnt/pt/ptwiki-20090128-pages-articles.xml
As did the import:
# mysqlimport -u root -p --local pt ./{page,revision,text}.txt
Enter password:
pt.page: Records: 1044220 Deleted: 0 Skipped: 0 Warnings: 0
pt.revision: Records: 1044220 Deleted: 0 Skipped: 0 Warnings: 3
pt.text: Records: 1044220 Deleted: 0 Skipped: 0 Warnings: 0
I'm running maintenance/rebuildall.php at the moment:
# php rebuildall.php
** Rebuilding fulltext search index (if you abort this will break searching; run this script again to fix):
Dropping index...
Rebuilding index fields for 2119470 pages...
442500
(still running)
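Since this can take days, one option is to leave it running in the background, roughly like this (the log file name here is just an example, not what I actually used):

# run rebuildall.php detached from the terminal and log its output
nohup php maintenance/rebuildall.php > rebuildall.log 2>&1 &
# follow the progress
tail -f rebuildall.log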
I'll send a note to the list with the results of this experiment. Let me know if you need additional information or help. Are you trying to set up any mirrors?
Joshua
Thanks for making this attempt. Let me know if your rebuildall.php has memory issues.
This is really getting confusing for me – because there are so many ways – all of which guaranteed to work – that work, and the one that is recommended – does not seem to work.
I would try out your approach too – but it would take time as I only have one computer to spare.
Thanks, O.o.
On Sun, Mar 8, 2009 at 6:49 PM, O. Olson olson_ot@yahoo.com wrote:
Thanks Joshua. I would prefer that you post to the Mailing List / Newsgroup – so that all can benefit from your ideas.
Well, like I said, I was going to email the list eventually. ;-)
Thanks for making this attempt. Let me know if your rebuildall.php has memory issues.
Seems fine - steady at 2.2% of memory available.
This is really getting confusing for me – because there are so many ways – all of which guaranteed to work – that work, and the one that is recommended – does not seem to work.
I think you mean "all of which are *not* guaranteed to work".
I would try out your approach too – but it would take time as I only have one computer to spare.
If you want, I can just send you a database dump - either now or after rebuildall.php finishes. Right now it's refreshing the links table, but it's only up to page_id 34,100 out of over 2 million pages. It'll be running for days.
Joshua
Thanks Joshua. I am intending to try two approaches. The first is to use xml2sql and then fill in the rest of the tables from the individual SQL dumps of the tables that are already provided. The second is to use mwdumper and then import the rest of the tables from the same SQL dumps, to see if there are any differences.
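For the "fill in the rest of the tables" step, loading one of the provided per-table SQL dumps looks roughly like this (the file name is only an example for the 20081008 enwiki download):

# decompress one of the per-table dumps and feed it straight to MySQL
gunzip -c enwiki-20081008-categorylinks.sql.gz | mysql -u root -p wikidb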
Thanks for posting your experience with rebuildall.php. I think I might be able to live with the bad syntax that I get – if I cannot manage to get this to work. Thanks again, O. O.
I don't remember if I already mentioned this: you can split the xml file * into smaller pieces then import it using importDump.php.
Use a loop to make a file like this and then run it:

#!/bin/bash
php maintenance/importDump.php < /path/pagexml.1
wait
php maintenance/importDump.php < /path/pagexml.2
...
I haven't tried starting many php importDump.php processes working on different XML files simultaneously - would that work?
* = http://blog.prashanthellina.com/2007/10/17/ways-to-process-and-use-wikipedia...
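The loop can also just run the pieces directly instead of generating a script file - a rough sketch, assuming the pieces are named /path/pagexml.1, /path/pagexml.2, ...:

#!/bin/bash
# Rough sketch: feed each split piece to importDump.php in sequence.
# Note: the glob expands in lexical order, so zero-pad the numbers
# (pagexml.01, pagexml.02, ...) if the import order matters.
for f in /path/pagexml.*; do
    php maintenance/importDump.php < "$f"
done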
Thanks Mohamed – this is a good suggestion, but I am a bit wary of trying it, because if I have problems later, I would not be sure whether they are due to using this script to split the XML files.
I understand that the script looks OK, in that it simply splits the XML file at the “</page>” boundaries – but I don’t know much about how this would affect the final result.
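One way to sanity-check such a split would be to compare the number of <page> elements before and after - roughly like this (file names follow the examples above):

# Count <page> elements in the original dump and in the split pieces;
# the totals should match if the split only happened at page boundaries.
grep -c "<page>" enwiki-20081008-pages-articles.xml
cat /path/pagexml.* | grep -c "<page>"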
Thanks again,
O. O.
Hi, I hate to resurrect an old thread, but for the sake of completeness I would like to post my experience with importing the XML dumps of Wikipedia into MediaWiki, so that it may help someone else looking for this information. I started this thread after all.
I was attempting to import the XML/SQL dumps of the English Wikipedia http://download.wikimedia.org/enwiki/20081008/ (not the most recent version) using the three methods described at http://www.mediawiki.org/wiki/Manual:Importing_XML_dumps
I. Using importDump.php: While this is the recommended method, I ran into memory issues. PHP (CLI) runs out of memory after a day or two, and then you have to restart the import. (The good thing is that, after a restart, it skips quickly over pages it has already imported.) However, the fact that it crashed too many times made me give up on it.
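One thing worth trying - I cannot promise it helps, since the maintenance script may set its own limit - is raising PHP's CLI memory limit explicitly; the value below is just an example:

# Override PHP's CLI memory limit for the import; 1024M is only an example.
php -d memory_limit=1024M maintenance/importDump.php < enwiki-20081008-pages-articles.xml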
II. Using mwdumper: This is actually pretty fast and does not give errors. However, I could not figure out why it imports only 6.1 million pages, compared to the 7.6 million pages in the dump mentioned above (not the most recent dump). The command line output correctly indicates that 7.6 million pages have been processed – but when you count the entries in the page table, only 6.1 million show up. I don’t know what happens to the rest, because as far as I can see there were no errors.
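For reference, a typical mwdumper pipeline looks roughly like this (the jar name and database name are illustrative, not necessarily what was used here), and the count check afterwards is how the discrepancy shows up:

# typical mwdumper pipeline: convert the XML dump to SQL and pipe it into MySQL
java -jar mwdumper.jar --format=sql:1.5 enwiki-20081008-pages-articles.xml | mysql -u root -p wikidb
# compare the page count reported on the command line with what landed in the database
mysql -u root -p wikidb -e "SELECT COUNT(*) FROM page;"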
III. Using xml2sql: This is actually not the recommended way of importing the XML dumps according to http://meta.wikimedia.org/wiki/Xml2sql – but it is the only way that really worked for me. However, compared to the other tools, this one needs to be compiled/installed before it works. As Joshua suggested, a simple:

$ xml2sql enwiki-20081008-pages-articles.xml
$ mysqlimport -u root -p --local wikidb ./{page,revision,text}.txt

worked for me.
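For those who have not built it before, the build is a fairly standard configure/make affair - roughly something like this, where the tarball name and version are only examples:

# rough build sketch for xml2sql; adjust the tarball name to whatever you download
tar xzf xml2sql-0.5.tar.gz
cd xml2sql-0.5
./configure
make
make install    # may need root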
Notes: Your local MediaWiki will still not look like the online wiki (even after you take into account that images do not come with these dumps).
1. For that, I first imported the SQL dumps into the other tables that were available at http://download.wikimedia.org/enwiki/20081008/ (except page – since you have already imported it by now).
2. I next installed the extensions listed in the “Parser hooks” section under “Installed extensions” on http://en.wikipedia.org/wiki/Special:Version
3. Finally, I recommend that you use HTML Tidy, because even after the above steps the output is screwed up. The settings for HTML Tidy go in LocalSettings.php. They are not there by default; you need to copy them from includes/DefaultSettings.php. The settings that worked for me were:

$wgUseTidy = true;
$wgAlwaysUseTidy = false;
$wgTidyBin = '/usr/bin/tidy';
$wgTidyConf = $IP.'/includes/tidy.conf';
$wgTidyOpts = '';
$wgTidyInternal = extension_loaded( 'tidy' );
And
$wgValidateAllHtml = false;
Ensure this last one is false - else you would get nothing for most of the pages.
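It is also worth double-checking that $wgTidyBin actually points at a tidy binary on your system, for example:

# confirm that the binary referenced by $wgTidyBin exists and report its version
which tidy
tidy -v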
I hope the above information helps others who also want to import the XML dumps of Wikipedia into MediaWiki.
Thanks to all who answered my posts, O. O.