Bugs item #1771889, was opened at 2007-08-10 17:20
Message generated for change (Tracker Item Submitted) made by Item Submitter
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=603138&aid=1771889...

Please note that this message will contain a full copy of the comment thread, including the initial issue submission, for this request, not just the latest update.

Category: None
Group: None
Status: Open
Resolution: None
Priority: 5
Private: No
Submitted By: Falk Steinhauer (falk_steinhauer)
Assigned to: Nobody/Anonymous (nobody)
Summary: Problems with namespaces in wikipedia.py
Initial Comment: I am using snapshot 2007-06-19:
In our wiki we use title prefixes for articles that are not in German. They are Fr: (French) and En: (English).
One of our French articles is the last entry on one page of the [[Special:All Pages]] listing (see here: http://www.wiki-aventurica.de/index.php?title=Spezial:Alle_Seiten).
If I use the command-line option -start:! the script runs into an endless loop. After Fr:xxxx is yielded, the script wants to continue with article xxxx, which in my case sorts alphabetically before Fr:xxxx, so the same pages are fetched again and again. If xxxx sorted after Fr:xxxx, some articles might be skipped instead.
I found the responsible line of code, wikipedia.py line 3504:

    # save the last hit, so that we know where to continue when we
    # finished all articles on the current page. Append a '!' so that
    # we don't yield a page twice.
    start = Page(self,hit).titleWithoutNamespace() + '!'
Maybe this can also be fixed in titleWithoutNamespace()
Is it necessary to cut off the namespace?
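
To illustrate the effect described above, here is a minimal, self-contained sketch. It does not use wikipedia.py at all: the helper names, the namespace set, and the example titles are hypothetical, and plain string comparison is assumed to approximate the wiki's sort order. It only shows how building the continuation key from a prefix-stripped title can move backwards or jump ahead, and how stripping only known namespaces would avoid that.

```python
# Hypothetical set of real namespace names; a real wiki would supply its own.
KNOWN_NAMESPACES = {"Talk", "User", "Help", "Category"}


def strip_any_prefix(title):
    """Drop everything up to the first colon, whether or not it is a namespace."""
    return title.split(":", 1)[-1]


def strip_known_namespace(title, namespaces=KNOWN_NAMESPACES):
    """Drop the prefix only when it names a known namespace."""
    prefix, sep, rest = title.partition(":")
    if sep and prefix in namespaces:
        return rest
    return title


def next_start(last_hit, strip):
    """Build the continuation key like the quoted line: stripped title + '!'."""
    return strip(last_hit) + "!"


if __name__ == "__main__":
    # Example titles are made up; "Fr:" is a title prefix, not a namespace.
    for hit in ("Fr:Aventurie", "Fr:Zauberei"):
        naive = next_start(hit, strip_any_prefix)
        careful = next_start(hit, strip_known_namespace)
        direction = "moves backwards" if naive < hit else "jumps ahead"
        print(hit, "-> naive:", naive, "(" + direction + ")")
        print(hit, "-> namespace-aware:", careful)
```

With the naive stripping, a key that sorts before the last hit makes the listing restart from an earlier position, while a key that sorts after it passes over every title in between. Keeping the full title whenever the prefix is not a real namespace always yields a key directly after the last hit.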
----------------------------------------------------------------------