Hi everyone. We need your help again.
We finally have a working mirror that generates the static HTML version of eswiki we need for CDPedia, using the DumpHTML extension. But it looks like the process will take about 3,000 hours of processing on our little Sempron server (4 months!).
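For context, the run is essentially the stock DumpHTML maintenance script; a sketch of the invocation (destination path, skin, and ID range are placeholders, and the flags are as we read the extension's documentation):

    # Render a slice of the wiki to static HTML with the DumpHTML extension.
    # -d: destination directory (placeholder path)
    # -k: skin to render with
    # -s/-e: article ID range, which lets several processes share the work
    php extensions/DumpHTML/dumpHTML.php -d /srv/eswiki-html -k monobook -s 1 -e 500000

Splitting the ID range across parallel processes is about the only speedup available, which is why the hardware matters so much for the question below.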
How long would it take on Wikimedia's servers?
Thanks
(This is intentional top-posting, to bring the thread up to date quickly.)
On 1 June 2010 at 18:53, Ángel González keisial@gmail.com wrote:
On 30/04/10 17:43, Alejandro J. Cura wrote:
Hi everyone, we need your help.
We are from Python Argentina, and we are working with educ.ar and the Wikimedia Foundation on adapting our cdpedia project to produce a DVD holding the entire Spanish Wikipedia, to be sent soon to Argentinian schools.
Hernán and Diego are the two interns tasked with updating the data that cdpedia uses to build the CD (it currently uses a static HTML dump dated June 2008), but they have run into problems trying to produce an up-to-date static HTML dump of the Spanish Wikipedia.
I'm CCing this list of people because I'm sure you've faced similar issues when building your own offline Wikipedias, or because you may know someone who can help us.
Below is an email from Hernán describing the problems he's found.
thanks!
-- alecu - Python Argentina

2010/4/30 Hernan Olivera lholivera@gmail.com:

Hi everybody,

I've been working on making an up-to-date static HTML dump of the Spanish Wikipedia, to use as the basis for the DVD. I followed the procedure detailed in the pages below, the same one used to generate the current (and out-of-date) static HTML dumps:

1) installing and setting up a MediaWiki instance
2) importing the XML from [6] with mwdumper
3) exporting the static HTML with MediaWiki's tool

The procedure finishes without throwing any errors, but the XML import produces malformed HTML pages with visible wiki markup. We really need a successful import of the Spanish XML dumps into a MediaWiki instance so that we can produce the up-to-date static HTML dump.

Links to the info I used:
[0] http://www.mediawiki.org/wiki/Manual:Installation_guide/es
[1] http://www.mediawiki.org/wiki/Manual:Running_MediaWiki_on_Ubuntu
[2] http://en.wikipedia.org/wiki/Wikipedia_database
[3] http://www.mediawiki.org/wiki/Manual:Importing_XML_dumps
[4] http://meta.wikimedia.org/wiki/Importing_a_Wikipedia_database_dump_into_Medi...
[5] http://meta.wikimedia.org/wiki/Data_dumps
[6] http://dumps.wikimedia.org/eswiki/20100331/
[7] http://www.mediawiki.org/wiki/Alternative_parsers
(among others)

Cheers,
--
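PS: the import in step 2 is the usual mwdumper pipeline; a sketch, with the database name and credentials as placeholders (the dump file follows the standard naming for [6]):

    # Convert the XML dump to SQL and pipe it straight into MySQL.
    # wikidb/wikiuser are placeholders for the local MediaWiki database.
    java -jar mwdumper.jar --format=sql:1.5 \
        eswiki-20100331-pages-articles.xml.bz2 \
      | mysql -u wikiuser -p wikidb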
Hi Hernán,
You may have used one of the corrupted dumps. See:
https://bugzilla.wikimedia.org/show_bug.cgi?id=18694
https://bugzilla.wikimedia.org/show_bug.cgi?id=23264
Otherwise, did you install ParserFunctions and the other extensions the wiki needs?
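If not, enabling ParserFunctions is a one-line addition to LocalSettings.php once the extension files are unpacked; a minimal sketch (the require path is the 2010-era convention):

    # After unpacking ParserFunctions into extensions/, register it
    # in LocalSettings.php (2010-era convention):
    echo 'require_once( "$IP/extensions/ParserFunctions/ParserFunctions.php" );' >> LocalSettings.php

Without it, every {{#if:}} and similar parser function used by templates comes out as literal wikitext, which matches the visible wiki markup Hernán describes.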