Hi! I made an XML dump with the --current option, then replaced some of the external domain links [http://somedomain.org] in it. When I import the dump back, these pages aren't updated. I think that's because text processors and editors do not update the sha1 / timestamp fields. Why doesn't maintenance/importDump.php recalculate the sha1 of the actual page content and compare it? How should I touch the timestamp / sha1 XML fields in the modified dump? Is there any ready-made solution? Or should I use pywikibot instead (which would take longer and run slower)?
Dmitriy
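For reference, the sha1 field in a dump is, as far as I know, the SHA-1 of the revision text written in base 36 and zero-padded to 31 characters. A minimal Python sketch of that calculation follows. The function name base36_sha1 is hypothetical, and it assumes the text is hashed as UTF-8 after XML entities have been unescaped.

```python
import hashlib

BASE36_DIGITS = "0123456789abcdefghijklmnopqrstuvwxyz"


def base36_sha1(text: str) -> str:
    """Return the SHA-1 of text as a 31-character, zero-padded base-36 string.

    Assumption: this mirrors the value stored in the dump's <sha1> field.
    """
    # Interpret the 160-bit digest as one big integer.
    n = int(hashlib.sha1(text.encode("utf-8")).hexdigest(), 16)
    digits = []
    while n:
        n, rem = divmod(n, 36)
        digits.append(BASE36_DIGITS[rem])
    # 36**31 is larger than 2**160, so 31 digits always suffice.
    return "".join(reversed(digits)).rjust(31, "0")


if __name__ == "__main__":
    print(base36_sha1("Some edited wikitext"))
```

Comparing this value with the sha1 field of each revision would show which revisions were altered by the search and replace.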
If you simply remove the timestamp from a revision in a dump, the importer appears to happily insert it with the current time as the timestamp. This may also cause cancer, summon Cthulhu, etc.
In addition to pywikibot, there's the Replace Text extension[0], which ought to be able to handle what you want to do.
The replacement in the XML dump was done automatically, and I did not want to remove the timestamp from all revisions (even the current ones) because the wiki is quite large (about 9000 articles). So I made my own tool to touch such revision timestamps: https://github.com/Dmitri-Sintsov/MwDumpProcessor/commit/079cc194215632db3e8... It's not a complete solution (no real parser, no support for extra fields such as those LiquidThreads inserts into the dump), but it is enough for ordinary NS_MAIN pages. It's strange that the dump importer itself does not compare base36sha1, either to warn about altered content or to import only the manually altered revisions.
Dmitriy
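The idea of touching only the altered revisions can be sketched in Python as below. This is not the MwDumpProcessor code, just an illustration under stated assumptions: each revision has timestamp, text and sha1 child elements, the sha1 is the zero-padded base-36 SHA-1 of the UTF-8 text, and the whole dump fits in memory. The names touch_revisions, base36_sha1 and local are hypothetical.

```python
import hashlib
import xml.etree.ElementTree as ET
from datetime import datetime, timezone

DIGITS = "0123456789abcdefghijklmnopqrstuvwxyz"


def base36_sha1(text):
    """SHA-1 of text as a 31-character zero-padded base-36 string (assumed dump format)."""
    n = int(hashlib.sha1(text.encode("utf-8")).hexdigest(), 16)
    out = ""
    while n:
        n, rem = divmod(n, 36)
        out = DIGITS[rem] + out
    return out.rjust(31, "0")


def local(tag):
    """Strip the '{namespace}' prefix that ElementTree puts on tag names."""
    return tag.rsplit("}", 1)[-1]


def touch_revisions(xml_string, now=None):
    """Refresh sha1 and timestamp on revisions whose text no longer matches sha1.

    Returns (root_element, number_of_touched_revisions).
    """
    root = ET.fromstring(xml_string)
    stamp = (now or datetime.now(timezone.utc)).strftime("%Y-%m-%dT%H:%M:%SZ")
    changed = 0
    for rev in root.iter():
        if local(rev.tag) != "revision":
            continue
        fields = {local(child.tag): child for child in rev}
        text = fields.get("text")
        sha1 = fields.get("sha1")
        timestamp = fields.get("timestamp")
        if text is None or sha1 is None or timestamp is None:
            continue  # unexpected layout: leave the revision alone
        actual = base36_sha1(text.text or "")
        if actual != (sha1.text or ""):
            sha1.text = actual
            timestamp.text = stamp
            changed += 1
    return root, changed
```

Revisions whose text still matches their sha1 are left untouched, so only the edited pages get a new timestamp. For a dump of several thousand pages, a streaming parser such as ET.iterparse would probably be a better fit than loading everything at once.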
On Mon, Aug 11, 2014 at 5:14 AM, Benjamin Lees emufarmers@gmail.com wrote:
If you simply remove the timestamp from a revision in a dump, the importer appears to happily insert it with the current time as the timestamp. This may also cause cancer, summon Cthulhu, etc.
In addition to pywikibot, there's the Replace Text extension[0], which ought to be able to handle what you want to do.
[0] https://www.mediawiki.org/wiki/Extension:Replace_Text