Manuel Schneider wrote:
Hi Tomasz,
thanks for your answer.
I was thinking of a wiki on a server that is not part of the cluster but is configured the same way as the production Wikimedia wikis. The software updates in particular are the issue: the installation must always stay suitable for Wikimedia content (MediaWiki version, extensions etc.). We also need the possibility to maintain our own skin and our own version of wiki2html on that server (I think both should become part of MediaWiki anyway). The initial idea was to use it as a blank setup into which we can import the latest dump of whatever wiki we want to convert to ZIM. But if a database connection to the Wikimedia cluster were possible, that would boost the whole process a lot.
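Something along these lines is what I have in mind for refreshing the blank setup. This is only a sketch; the dump URL, the paths and the wiki name are assumptions, not actual configuration:

    #!/usr/bin/env python
    # Sketch of a refresh script for the mirror wiki: fetch the latest
    # pages-articles dump and feed it to MediaWiki's importDump.php.
    # Dump URL, paths and wiki name are assumptions.
    import subprocess
    import urllib.request

    WIKI = "dewiki"  # whichever wiki we want to convert to ZIM
    DUMP_URL = ("https://dumps.wikimedia.org/%s/latest/"
                "%s-latest-pages-articles.xml.bz2") % (WIKI, WIKI)
    DUMP_FILE = "/tmp/%s-pages-articles.xml.bz2" % WIKI
    MEDIAWIKI = "/srv/mediawiki"  # assumed install path of the mirror

    # Download the latest dump.
    urllib.request.urlretrieve(DUMP_URL, DUMP_FILE)

    # importDump.php reads the XML on stdin, so decompress on the fly.
    unzip = subprocess.Popen(["bunzip2", "-c", DUMP_FILE],
                             stdout=subprocess.PIPE)
    subprocess.check_call(
        ["php", MEDIAWIKI + "/maintenance/importDump.php"],
        stdin=unzip.stdout)
    unzip.stdout.close()
    if unzip.wait() != 0:
        raise RuntimeError("bunzip2 failed")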
If you wanted a live and continuous connection to our DBs, then you would be changing content like any other user and would be held to the same editing standards. There is no use case where you hit the wiki DB live and take edits but don't have them propagate through the rest of the system.
Really, it sounds like you want a sandbox to play with that is reasonably up to date relative to our cluster, seeded with an initial dump of a Wikipedia snapshot. That could be pulled either from a one-time slave or from the XML dump.
But after thinking all this through, I am at the point where I see the big solution: put the zimwriter into MediaWiki / the Wikimedia dumps and we're set...
Yup, that's the simplest solution by far if it's stable enough. Let me see how easy it is to plug into one of the current dumps.
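Roughly what I picture is a small hook in the dump scheduler that runs once a wiki's XML step has finished and hands the result to zimwriter. The zimwriter command line below is a placeholder, since the real interface is still in flux:

    import os
    import subprocess

    def build_zim(wiki, dump_dir, out_dir):
        # Hand a finished dump to zimwriter. The flags are hypothetical;
        # whatever the real zimwriter ends up expecting would go here.
        dump = os.path.join(dump_dir, wiki + "-pages-articles.xml.bz2")
        zim = os.path.join(out_dir, wiki + ".zim")
        subprocess.check_call(["zimwriter", "--input", dump,
                               "--output", zim])
        return zim

    # Called from the dump scheduler once the XML step reports success:
    # build_zim("dewiki", "/mnt/dumps/dewiki/latest", "/mnt/zim")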
What do you think?
At the moment the DVD for LinuxTag in June is the priority and the zimwriter is still a work in progress, but we need a solution for current Wikipedia dumps.
How stable would you consider it? Does it support UTF-8 encoding across the board? Are there any performance impacts I should know about?
-tomasz