Hi Yoni,
On Mon, Sep 7, 2015 at 6:01 PM, Yoni Lamri <lamri.yoni(a)gmail.com> wrote:
Hello everybody,
This is my first thread on the list. I'm relatively new to Wikipedia/MediaWiki
usage (1 month).
Welcome!
I followed the doc (RTFM as usual) and every step seems to end in a wall.
My simple question: how do I correctly install a Wikipedia mirror from dumps
in MediaWiki?
My goal:
Create an offline Wikimedia server from the FR, EN or PT dumps (1 language
only).
I did some tests with the wowiki, which is small enough for testing purposes,
but I encountered problems:
Most of the tools mentioned in the doc are either 404 or outdated, for
example xml2sql or mwdumper.
The original Wikimedia importer takes 20 minutes to import the 1.6 MB wowiki
dump, which means approximately a decade for the FR one...
I don't think the time taken by the import script provided in MediaWiki is
directly proportional to the size of the dump, as there are also other
factors. You can perhaps try to use a larger dump (perhaps simplewiki or
even metawiki) to see the difference.
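One of those factors may be the number of pages and revisions rather than
the raw file size. As a rough illustration only, the sketch below streams
through a pages XML export and counts both, so that import time can be
compared per revision instead of per byte. count_dump_items and local_name
are hypothetical helper names, not part of MediaWiki, and the sketch assumes
a standard export with <page> and <revision> elements.

```python
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip the XML namespace, e.g. '{http://...}page' -> 'page'."""
    return tag.rsplit("}", 1)[-1]


def count_dump_items(source):
    """Count <page> and <revision> elements in a pages XML export.

    `source` is a filename or a binary file object (assumption: the
    dump has already been decompressed or is opened via bz2.open).
    """
    pages = revisions = 0
    # iterparse streams the file, so the whole dump is never held in memory.
    for _event, elem in ET.iterparse(source, events=("end",)):
        name = local_name(elem.tag)
        if name == "revision":
            revisions += 1
        elif name == "page":
            pages += 1
            elem.clear()  # drop the finished page's content to limit memory use
    return pages, revisions
```

For a compressed dump, something like count_dump_items(bz2.open(path, "rb"))
should work after importing bz2. Dividing the measured import time by the
revision count gives a per-revision rate that is easier to extrapolate than
minutes per megabyte.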
I tried the latest MWDumper found on GitHub. It quickly generates a good SQL
file, but it seems that special characters in page text are not escaped
("slashed"), so \n returns an SQL error...
I am not sure if this is related to
https://phabricator.wikimedia.org/T16379
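Whether or not it is the same bug, here is a minimal sketch of what the
"slashing" would need to do to page text before it goes into a MySQL string
literal. escape_mysql_string and sql_literal are hypothetical helpers, not
MWDumper code, and the sketch assumes the default MySQL mode in which the
backslash is the escape character.

```python
# Characters that need a backslash escape inside a MySQL string literal
# (assumption: the NO_BACKSLASH_ESCAPES SQL mode is not enabled).
_MYSQL_ESCAPES = {
    "\0": "\\0",    # NUL byte
    "\n": "\\n",    # newline
    "\r": "\\r",    # carriage return
    "\x1a": "\\Z",  # Ctrl-Z
    "\\": "\\\\",   # backslash itself
    "'": "\\'",     # single quote
    '"': '\\"',     # double quote
}


def escape_mysql_string(text):
    """Return `text` with MySQL special characters backslash-escaped."""
    return "".join(_MYSQL_ESCAPES.get(ch, ch) for ch in text)


def sql_literal(text):
    """Wrap the escaped text in single quotes for use in an INSERT statement."""
    return "'" + escape_mysql_string(text) + "'"
```

If you end up writing your own importer, parameterized queries from the
database driver are usually a safer choice than escaping by hand.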
Some columns are missing from the original MediaWiki installer, for example
"page_counter" in the "page" table. Are any extensions needed to install
dumps?
The "page_counter" column was removed in MediaWiki 1.25. You can obtain it
as an extension here:
https://www.mediawiki.org/wiki/Extension:HitCounters
How can I find/get media files (low-res images are enough)?
Media dumps are currently not available due to disk space issues. There is
a possibility that these dumps will be produced in the future, and that
will be announced on this mailing list.
How do I deal with interwiki media/links/pages?
You will have to set up your interwiki table using this script:
https://git.wikimedia.org/blob/operations%2Fdumps.git/adb99633787b188dc78ac…
Note that there is a small bug in that script that causes the interwiki
file to not be downloaded correctly. You will have to apply this patch to
get it working:
https://gerrit.wikimedia.org/r/220075
The next step to follow will be the update procedure, incremental or
complete...
It seems the French Wikimedia team is stuck on my questions...
I'm open to doing development (Python, PHP or JEE), but I need an entry
point to begin if nobody is working on it...
Thank you for your work. I know you are quite busy with the dump
generation, but our project is quite serious, lots of people want it, and
we need quick answers on the feasibility.
Best regards,
Yoni
_______________________________________________
Xmldatadumps-l mailing list
Xmldatadumps-l(a)lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/xmldatadumps-l
Hope this helps.
--
Best regards,
Hydriz Scholz