Hello.
I'm not sure whether this is the appropriate list for this kind of question; I'd appreciate it if someone could direct me to the proper one.
I have the task of making a local, fully functional mirror of a Wikipedia subdomain (articles, images, etc. must be located on the local server). Currently there are not many articles, so downloading a dump once a day may be an option. But there is a problem: how do I synchronize changes made to the local copy back to Wikipedia? Is there any piece of software that could help?
I would appreciate any help.
Sincerely, Artyom.
On Thu, Jul 09, 2009 at 12:01:32AM +0500, Artyom Sokolov wrote:
I have the task of making a local, fully functional mirror of a Wikipedia subdomain (articles, images, etc. must be located on the local server). Currently there are not many articles, so downloading a dump once a day may be an option. But there is a problem: how do I synchronize changes made to the local copy back to Wikipedia? Is there any piece of software that could help?
Don't do that. Synchronizing back is a very difficult task, and you will find yourself in deep trouble very soon. Without proper replication conflict resolution, you'll end up with junk either on your side or on the Wikipedia side. In the latter case you'll probably get blocked rather soon; in the former your users will get frustrated because their edits don't get through.
Regards,
jens
On Wed, Jul 8, 2009 at 3:40 PM, Jens Frank <jf@mormo.org> wrote:
On Thu, Jul 09, 2009 at 12:01:32AM +0500, Artyom Sokolov wrote:
I have the task of making a local, fully functional mirror of a Wikipedia subdomain (articles, images, etc. must be located on the local server). Currently there are not many articles, so downloading a dump once a day may be an option. But there is a problem: how do I synchronize changes made to the local copy back to Wikipedia? Is there any piece of software that could help?
Don't do that. Synchronizing back is a very difficult task, and you will find yourself in deep trouble very soon. Without proper replication conflict resolution, you'll end up with junk either on your side or on the Wikipedia side. In the latter case you'll probably get blocked rather soon; in the former your users will get frustrated because their edits don't get through.
Regards,
jens
Using dumps locally for an offline version is pretty easy to set up (download the right dump, import with either importDump or mwdumper).
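For reference, here's a rough sketch of that workflow; the dump URL and file names are only examples, and mwdumper's exact invocation may vary with your setup:

    # fetch the current-revisions dump for the wiki you want to mirror
    wget https://dumps.wikimedia.org/enwiki/latest/enwiki-latest-pages-articles.xml.bz2
    bunzip2 enwiki-latest-pages-articles.xml.bz2

    # import into an existing MediaWiki install, then rebuild recent changes
    php maintenance/importDump.php < enwiki-latest-pages-articles.xml
    php maintenance/rebuildrecentchanges.php

    # or, for large dumps, pipe mwdumper output straight into MySQL
    java -jar mwdumper.jar --format=sql:1.5 enwiki-latest-pages-articles.xml.bz2 | mysql -u wikiuser -p wikidb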
Syncing changes back to the live site is a Very Bad Idea, like Jens said. There is absolutely no supported mechanism to do this (see bugs 2054 and 15468).
-Chad
Don't do that. Synchronizing back is a very difficult task, and you will find yourself in deep trouble very soon. Without proper replication conflict resolution, you'll end up with junk either on your side or on the Wikipedia side. In the latter case you'll probably get blocked rather soon; in the former your users will get frustrated because their edits don't get through.
Regards,
jens
At one wiki site (not a Wikipedia) an improved version of Special:Ancientpages is used to manually synchronize between localhost and webhost clone wiki sites. This version of Special:Ancientpages allows selecting exportable namespaces other than NS_MAIN and optionally submits the selected pages to Special:Export. By default, the functionality of Special:Export is very limited (at least it was the last time I checked).
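The basic Export/Import round trip, for what it's worth, looks roughly like this (the page title is just a placeholder):

    # grab the current revision of a page as export XML from the source wiki
    curl "https://en.wikipedia.org/wiki/Special:Export/Main_Page" > page.xml

    # pull it into the local clone
    php maintenance/importDump.php < page.xml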
A better approach would probably be to introduce a --date option in maintenance/dumpBackup.php.
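Right now dumpBackup.php only knows whole-wiki snapshots; the --date flag below is only my proposal and doesn't exist yet:

    # dump the latest revision of every page (this works today)
    php maintenance/dumpBackup.php --current --quiet > snapshot.xml

    # proposed: restrict the dump to pages touched since a given timestamp
    # php maintenance/dumpBackup.php --current --date=20090701000000 > delta.xml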
The important problem, though, is that during XML import not all hooks are called, so extension-specific data is not saved/restored correctly :-( I think it's a major problem that these hooks don't allow extensions to pass their own XML tags into the dumps.
Dmitriy
On Thu, Jul 9, 2009 at 1:14 AM, Dmitriy Sintsov <questpc@rambler.ru> wrote:
Don't do that. Synchronizing back is a very difficult task, and you will find yourself in deep trouble very soon. Without proper replication conflict resolution, you'll end up with junk either on your side or on the Wikipedia side. In the latter case you'll probably get blocked rather soon; in the former your users will get frustrated because their edits don't get through.
Regards,
jens
At one wiki site (not a Wikipedia) an improved version of Special:Ancientpages is used to manually synchronize between localhost and webhost clone wiki sites. This version of Special:Ancientpages allows selecting exportable namespaces other than NS_MAIN and optionally submits the selected pages to Special:Export. By default, the functionality of Special:Export is very limited (at least it was the last time I checked).
A better approach would probably be to introduce a --date option in maintenance/dumpBackup.php.
The important problem, though, is that during XML import not all hooks are called, so extension-specific data is not saved/restored correctly :-( I think it's a major problem that these hooks don't allow extensions to pass their own XML tags into the dumps.
Dmitriy
Tags like <ref>, which do not appear in core, are still included in the dumps; you just have to have the relevant extension installed for them to do anything after you've imported the content. If you don't have Cite and ParserFunctions (at a minimum) installed, don't expect much of enwiki to work after import.
That being said, I've found that having Cite and/or ParserFunctions activated *while* importing slows down the process (and occasionally causes it to halt). It's better to import first and then activate the extensions.
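In script form the ordering is just this (the dump file name is a placeholder):

    # LocalSettings.php: keep the Cite / ParserFunctions require_once lines commented out for now
    php maintenance/importDump.php < enwiki-latest-pages-articles.xml
    # now re-enable the extensions in LocalSettings.php, then rebuild
    php maintenance/rebuildrecentchanges.php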
-Chad