Hi Brad,
You mentioned "a while back" for "apcontinue"; how recent was that change? This dump generator is trying to archive all sorts of MediaWiki versions, so we will probably need to write a backward compatibility handler into the script itself.
...and I agree, the code is a total mess. We need to get someone to rewrite the whole thing soon.
On Fri, Nov 9, 2012 at 11:50 PM, Brad Jorsch bjorsch@wikimedia.org wrote:
You're searching for the continue parameter as "apfrom", but this was changed to "apcontinue" a while back. Changing line 162 to something like this should probably do it:
m = re.findall(r'<allpages (?:apfrom|apcontinue)="([^>]+)" />', xml)
Note that for full correctness, you probably should omit both apfrom and apcontinue entirely from params the first time around, and send back whichever of the two is found by the above line in subsequent queries.
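For concreteness, here's a rough sketch of that continuation loop. It's untested, and api_url plus the surrounding fetch code are placeholders of my own, not what dumpgenerator.py actually does:

import re
import urllib.parse
import urllib.request

def get_all_pages(api_url, namespace='0'):
    titles = []
    continuation = None  # omit apfrom/apcontinue entirely on the first request
    while True:
        params = {'action': 'query', 'list': 'allpages',
                  'apnamespace': namespace, 'aplimit': '500', 'format': 'xml'}
        if continuation:
            name, value = continuation
            params[name] = value  # send back whichever attribute we were given
        url = api_url + '?' + urllib.parse.urlencode(params)
        xml = urllib.request.urlopen(url).read().decode('utf-8')
        # titles with XML entities would need unescaping; omitted in this sketch
        titles += re.findall(r'<p [^>]*title="([^"]+)"', xml)
        # the server reports the continue value as apfrom (older MediaWiki)
        # or apcontinue (newer), inside the <query-continue> element
        m = re.search(r'<allpages (apfrom|apcontinue)="([^"]+)"', xml)
        if not m:
            break  # no continuation element left: all pages retrieved
        continuation = (m.group(1), m.group(2))
    return titles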
Also, why in the world aren't you using an XML parser (or a JSON parser with format=json) to process the API response instead of trying to parse the XML using regular expressions?!
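With format=json, for example, the whole regex question goes away. Again only a sketch with placeholder names, using the query-continue shape the API returned at the time:

import json
import urllib.parse
import urllib.request

def get_all_pages_json(api_url, namespace='0'):
    titles = []
    params = {'action': 'query', 'list': 'allpages',
              'apnamespace': namespace, 'aplimit': '500', 'format': 'json'}
    while True:
        url = api_url + '?' + urllib.parse.urlencode(params)
        data = json.loads(urllib.request.urlopen(url).read().decode('utf-8'))
        titles += [page['title'] for page in data['query']['allpages']]
        # the query-continue block carries either apfrom or apcontinue;
        # merging it back into params handles both cases uniformly
        cont = data.get('query-continue', {}).get('allpages')
        if not cont:
            break  # no more pages in this namespace
        params.update(cont)
    return titles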
On Fri, Nov 9, 2012 at 2:27 AM, Federico Leva (Nemo) nemowiki@gmail.com wrote:
It's completely broken: https://code.google.com/p/wikiteam/issues/detail?id=56
It will download only a fraction of the wiki, 500 pages at most per namespace.