Hi everyone. We need your help again.

We finally have a working mirror for generating the static HTML version of eswiki that we need for cdpedia, using the DumpHTML extension.
But it looks like the process will take about 3000 hours of processing on our little Sempron server (4 months!).

How long would it take on Wikimedia's servers?
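(In case it helps to reproduce the numbers: below is a minimal sketch of the kind of invocation we mean. The destination path is a placeholder, and the -s/-e article-ID range options are taken from the DumpHTML documentation, so treat this as an assumption rather than our exact command line.)

  # generate the static HTML from the local eswiki mirror
  php extensions/DumpHTML/dumpHTML.php -d /srv/static-eswiki

  # if -s/-e (start/end article ID) behave as documented, the run can be
  # split into ranges and parallelized on more capable hardware:
  php extensions/DumpHTML/dumpHTML.php -d /srv/static-eswiki -s 1 -e 500000 &
  php extensions/DumpHTML/dumpHTML.php -d /srv/static-eswiki -s 500001 -e 1000000 &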


Thanks

(This is intentional top-posting, to give a quick update on the situation.)


On 1 June 2010 18:53, Ángel González <keisial@gmail.com> wrote:
On 30/04/10 17:43, Alejandro J. Cura wrote:
> Hi everyone, we need your help.
>
> We are from Python Argentina, and we are working on adapting our
> cdpedia project to make a DVD together with educ.ar and the Wikimedia
> Foundation, holding the entire Spanish Wikipedia, which will soon be
> sent to Argentinian schools.
>
> Hernán and Diego are the two interns tasked with updating the data
> that cdpedia uses to make the CD (it currently uses a static HTML
> dump dated June 2008), but they are running into problems while
> trying to build an up-to-date static HTML dump of the Spanish
> Wikipedia.
>
> I'm CCing this list of people because I'm sure you've faced similar
> issues when making your offline Wikipedias, or because you may know
> someone who can help us.
>
> Following is an email from Hernán describing the problems he's found.
>
> thanks!
> -- alecu - Python Argentina
>
> 2010/4/30 Hernan Olivera <lholivera@gmail.com>:
> Hi everybody,
>
> I've been working on making an up-to-date static HTML dump of the
> Spanish Wikipedia, to use as a basis for the DVD. I've followed the
> procedures detailed in the pages below, which were used to generate
> the current (and now out-of-date) static HTML dumps:
>
> 1) installing and setting up a MediaWiki instance
> 2) importing the XML from [6] with mwdumper
> 3) exporting the static HTML with MediaWiki's tool
>
> The procedure finishes without throwing any errors, but the XML
> import produces malformed HTML pages that show visible wiki markup.
> We really need a successful import of the Spanish XML dumps into a
> MediaWiki instance so that we can produce the up-to-date static HTML
> dump.
>
> Links to the info I used:
> [0] http://www.mediawiki.org/wiki/Manual:Installation_guide/es
> [1] http://www.mediawiki.org/wiki/Manual:Running_MediaWiki_on_Ubuntu
> [2] http://en.wikipedia.org/wiki/Wikipedia_database
> [3] http://www.mediawiki.org/wiki/Manual:Importing_XML_dumps
> [4] http://meta.wikimedia.org/wiki/Importing_a_Wikipedia_database_dump_into_MediaWiki
> [5] http://meta.wikimedia.org/wiki/Data_dumps
> [6] http://dumps.wikimedia.org/eswiki/20100331/
> [7] http://www.mediawiki.org/wiki/Alternative_parsers (among others)
>
> Cheers, --
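(For concreteness, step 2) in the quoted procedure usually boils down to something like the sketch below; the file name comes from [6], while the database name and credentials are placeholders.)

  # stream the XML dump into MySQL with mwdumper; mwdumper reads the
  # .bz2 directly, so there is no need to decompress first
  java -jar mwdumper.jar --format=sql:1.5 \
      eswiki-20100331-pages-articles.xml.bz2 \
    | mysql -u wikiuser -p wikidb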
Hello Hernán,

You may have used one of the corrupted dumps. See
https://bugzilla.wikimedia.org/show_bug.cgi?id=18694
https://bugzilla.wikimedia.org/show_bug.cgi?id=23264
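A quick way to rule that out is to verify the downloaded files against the published checksums. A sketch, assuming the usual naming of the md5sums file next to the dump at [6]:

  # fetch the checksum list and verify the pages-articles file
  wget http://dumps.wikimedia.org/eswiki/20100331/eswiki-20100331-md5sums.txt
  grep pages-articles eswiki-20100331-md5sums.txt | md5sum -c -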

Otherwise, did you install ParserFunctions and the other extensions that are needed?
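For reference, on a 2010-era MediaWiki enabling ParserFunctions is a one-line include in LocalSettings.php (the path assumes the extension was unpacked under extensions/):

  # append the include to the wiki configuration
  echo 'require_once( "$IP/extensions/ParserFunctions/ParserFunctions.php" );' \
    >> LocalSettings.php

Without it, template code such as {{#if:...}} is not parsed and shows up literally in the rendered pages, which could explain the visible wikimarkup.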




--
Hernan Olivera