Manuel Schneider wrote:
Hi Tomasz,
thanks for your answer.
I was thinking of a wiki on a server that is not part of the cluster but is configured the
same way as the production Wikimedia wikis. Keeping the software up to date is the main
issue, so that the installation is always suitable for Wikimedia content (MediaWiki version,
extensions etc.).
Then we need the possibility to maintain our own skin and our own version of wiki2html on
that server (I think both should become part of MediaWiki anyway).
The initial idea was that we use it as a blank setup where we can import the latest dump
of whatever wiki we want to convert to ZIM. But if a database connection to the Wikimedia
cluster were possible, this would speed up the whole process a lot.
If you wanted a live and continuous connection to our dbs then you
would be changing content like any other user and would be held to the
same editing standards. There is no use in having a wiki db that is live
and takes edits but does not propagate them through the rest of the system.
Really it sounds like you want a sandbox to play with that's reasonably
up to date relative to our cluster, with an initial dump of a Wikipedia
snapshot. That could either be pulled from a one-time slave or the XML dump.
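For the XML dump route, here is a rough Python sketch of how the pages of a dump could be streamed one at a time before being handed to a converter. It is only an illustration: the helper name iter_pages and the two-page sample document are made up, and it assumes the usual layout of <page> elements containing a <title> and a <text>. A real dump is far larger and would be opened as a (possibly compressed) file rather than from memory.

```python
import io
import xml.etree.ElementTree as ET

# Made-up miniature export, only to exercise the parser below.
SAMPLE = """<?xml version="1.0" encoding="UTF-8"?>
<mediawiki xmlns="http://www.mediawiki.org/xml/export-0.3/">
  <page>
    <title>Zürich</title>
    <revision><text>Stadt in der Schweiz.</text></revision>
  </page>
  <page>
    <title>Test</title>
    <revision><text></text></revision>
  </page>
</mediawiki>
""".encode("utf-8")


def iter_pages(source):
    """Yield (title, text) for each <page> in an XML export stream.

    Hypothetical helper: it keeps only the last <text> seen inside a page
    and frees each page element after use so memory stays bounded.
    """
    title, text = None, None
    for _event, elem in ET.iterparse(source, events=("end",)):
        tag = elem.tag.rsplit("}", 1)[-1]  # drop the XML namespace prefix
        if tag == "title":
            title = elem.text
        elif tag == "text":
            text = elem.text or ""
        elif tag == "page":
            yield title, text
            title, text = None, None
            elem.clear()  # release the finished page subtree


pages = list(iter_pages(io.BytesIO(SAMPLE)))
for page_title, page_text in pages:
    print(page_title, len(page_text))
```

Streaming with iterparse rather than loading the whole tree is a deliberate choice here, since a full Wikipedia dump will not fit comfortably in memory.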
But after thinking through all this I am at the point where I see the big solution: Put
the zimwriter into Mediawiki / the Wikimedia dumps and we're set...
Yup, that's the simplest solution by far if it's stable enough. Let me
see how easy it is to plug into one of the current dumps.
What do you think?
At the moment the DVD for LinuxTag in June is the priority and the zimwriter is still a work
in progress, but we need a solution for current Wikipedia dumps.
How stable would you consider it? Does it support UTF encoding across
the board? Any performance impacts that I should know about?
-tomasz