Hi Tomasz,
thanks for your answer.
I was thinking of a wiki on a server which is not part of the cluster but is configured the same way as the production Wikimedia wikis. Keeping the software up to date is the main issue: the installation must always match current Wikimedia content (MediaWiki version, extensions etc.). We also need to be able to maintain our own skin and our own version of wiki2html on that server (I think both should become part of MediaWiki anyway). The initial idea was to use it as a blank setup into which we can import the latest dump of whatever wiki we want to convert to ZIM. But if a database connection to the Wikimedia cluster were possible, that would speed up the whole process a lot.
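For illustration, importing a dump into such a blank instance would boil down to something like this (just a rough sketch; the dewiki URL is only an example and the exact dump location may differ):

  # fetch the latest article dump of the wiki to convert (example: German Wikipedia)
  wget http://download.wikimedia.org/dewiki/latest/dewiki-latest-pages-articles.xml.bz2
  # feed it into the standard MediaWiki import script, run from the MediaWiki root
  bzcat dewiki-latest-pages-articles.xml.bz2 | php maintenance/importDump.php
  # rebuild derived data after a large import
  php maintenance/rebuildrecentchanges.php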
But after thinking all of this through, I am at the point where I see the big solution: put the zimwriter into MediaWiki / the Wikimedia dumps and we're set...
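Schematically the dump process would then just gain one extra step, something like this (the zimwriter invocation is only a placeholder; its real command-line syntax is still in flux):

  # placeholder pipeline: render the dump to HTML, then pack the result into a ZIM file
  bzcat dewiki-latest-pages-articles.xml.bz2 | wiki2html | zimwriter dewiki.zim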
What do you think?
At the moment the DVD for LinuxTag in June is the priority and the zimwriter is still work in progress, but we need a solution for current Wikipedia dumps.
Greets,
Manuel
-- Original Message -- Subject: Re: [openZIM dev-l] Wikimedia-updated MediaWiki instance for dumping From: Tomasz Finc <tfinc@wikimedia.org> Date: 20.04.2009 22:29
Manuel Schneider wrote:
Dear Brion,
is it possible for us to have a MediaWiki instance which is kept up to date by the regular synchronisation on the WM cluster, so that we can import all types of dumps into it and make ZIM files from them?
We will use a modified version of wiki2html and a special skin for the dumps, so we need a separate instance, but it should always be up to date with the Wikimedia projects. This would be extremely helpful for us.
Any ideas?
Greets,
Manuel
Filling in since Brion is out in meetings today.
I'm a little confused with what you are asking here.
Are you requesting a machine that is configured the same as any one of our web server nodes, serving Wikipedia traffic and connecting to our databases, but not part of our general cluster? Or are you asking for a blank node to populate with different types of data that you can in turn generate ZIM files from?
Let us know and we can get something in place.
--tomasz
Manuel Schneider wrote:
Hi Tomasz,
thanks for your answer.
I was thinking of a wiki on a server which is not part of the cluster but is configured the same way as the production Wikimedia wikis. Keeping the software up to date is the main issue: the installation must always match current Wikimedia content (MediaWiki version, extensions etc.). We also need to be able to maintain our own skin and our own version of wiki2html on that server (I think both should become part of MediaWiki anyway). The initial idea was to use it as a blank setup into which we can import the latest dump of whatever wiki we want to convert to ZIM. But if a database connection to the Wikimedia cluster were possible, that would speed up the whole process a lot.
If you wanted a live and continuous connection to our DBs, then you would be changing content like any other user and would be held to the same editing standards. There is no use case for a wiki DB that is live and takes edits but does not propagate them through the rest of the system.
Really it sounds like you want a sandbox to play with that is reasonably up to date relative to our cluster, seeded with an initial dump of a Wikipedia snapshot. That could be pulled either from a one-time slave or from the XML dump.
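For the one-time slave option that would amount to something like the following (host and database names here are made up):

  # one-time copy of a single wiki's database from a slave into the sandbox
  mysqldump -h db-slave.example.org dewiki | mysql -h zim-sandbox.example.org dewiki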
But after thinking all of this through, I am at the point where I see the big solution: put the zimwriter into MediaWiki / the Wikimedia dumps and we're set...
Yup, that's the simplest solution by far if it's stable enough. Let me see how easy it is to plug into one of the current dumps.
What do you think?
At the moment the DVD for LinuxTag in June is the priority and the zimwriter is still work in progress, but we need a solution for current Wikipedia dumps.
How stable would you consider it? Does it support UTF-8 encoding across the board? Are there any performance impacts that I should know about?
-tomasz
Pinging for updates...
Did we poke something at this? Anything else needed?
-- brion
Manuel Schneider wrote:
I was thinking of a wiki on a server which is not part of the cluster but is configured the same way as the production Wikimedia wikis. Keeping the software up to date is the main issue: the installation must always match current Wikimedia content (MediaWiki version, extensions etc.).
I manage my MediaWiki installs with the help of 3 scripts:
* http://kiwix.svn.sourceforge.net/viewvc/kiwix/mirroring_tools/scripts/mirror...
* http://kiwix.svn.sourceforge.net/viewvc/kiwix/mirroring_tools/scripts/instal...
* http://kiwix.svn.sourceforge.net/viewvc/kiwix/mirroring_tools/scripts/mirror...
They do not do 100% of the work, but they let you have, within 10 minutes, a MediaWiki instance which is almost the same as the one you want to mirror.
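Roughly, what they automate boils down to something like this (the release branch and paths are examples, not literally what the scripts do):

  # check out the same MediaWiki release as the wiki being mirrored
  svn checkout http://svn.wikimedia.org/svnroot/mediawiki/branches/REL1_14/phase3 mediawiki
  # after writing LocalSettings.php, bring the database schema up to date
  php maintenance/update.php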
Emmanuel