Bugs item #2061186 was opened at 2008-08-20 03:26
Message generated for change (Comment added) made by wikipedian
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=603138&aid=2061186...
Please note that this message will contain a full copy of the comment thread, including the initial issue submission, for this request, not just the latest update.

Category: General
Group: None
Status: Open
Resolution: None
Priority: 8
Private: No
Submitted By: Woo-Jin Kim (kwj2772)
Assigned to: Nobody/Anonymous (nobody)
Summary: interwiki.py doesn't work well
Initial Comment:
I have found a serious bug in interwiki.py. I'm using Pywikipedia revision 5816.
Error message:

C:\Python25\pywikipedia\interwiki.py -autonomous -lang:en -start:!
Checked for running processes. 1 process currently running, including currently process.
NOTE:Number of pages queued is 0, trying to add 60 more.
Retreiving Allpages special page for wikipedia:en from %21, namespace 0
NOTE:Nothing to left to do
Why is there nothing left to do?
----------------------------------------------------------------------
Comment By: Daniel Herding (wikipedian)
Date: 2008-08-20 11:43
Message: Logged In: YES user_id=880694 Originator: NO
The problem now is: In the new version of Special:Allpages, you need not only a "from" parameter, but also a "to" parameter to get page titles. This makes things really complicated for us, as we first need to find out the title of the "to" page.
The following strategy could work:
* The user runs interwiki.py -start:Foo
* The bot loads http://en.wikipedia.org/w/index.php?title=Special%3AAllPages&from=Foo
* Using regular expressions or BeautifulSoup, the bot searches for lines like "Foo to Forbes Fictional 15", and follows the links.
* http://en.wikipedia.org/w/index.php?title=Special:AllPages&from=Foo&... is split again, so we need to do this recursively.
* Now http://en.wikipedia.org/w/index.php?title=Special:AllPages&from=Foo&... is a normal page that we can parse with our existing code (see the sketch below).
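A minimal sketch of that recursive descent could look like this (Python 2, matching the environment reported elsewhere in this thread; the function names, the URL building, and the regular expressions for the range links and the title listing are assumptions about the new Special:AllPages HTML, not the actual pywikipedia code):

import re
import urllib

BASE = 'http://en.wikipedia.org/w/index.php'

def fetch_allpages(params):
    # Fetch Special:AllPages with the given query parameters.
    query = urllib.urlencode([('title', 'Special:AllPages')] + params)
    return urllib.urlopen(BASE + '?' + query).read()

def allpages_titles(frm, to=None):
    # Yield page titles starting at 'frm', descending recursively through
    # index pages of "X to Y" range links until a plain listing is reached.
    params = [('from', frm)]
    if to is not None:
        params.append(('to', to))
    html = fetch_allpages(params)
    # Assumed format of the sub-range links: ...&from=X&to=Y (or &amp;).
    # A real implementation would need a stricter test to tell index
    # pages and normal listings apart.
    ranges = re.findall(r'from=([^&"]+)&(?:amp;)?to=([^&"]+)', html)
    if ranges:
        for sub_from, sub_to in ranges:
            for title in allpages_titles(urllib.unquote_plus(sub_from),
                                         urllib.unquote_plus(sub_to)):
                yield title
    else:
        # A normal listing: parse the titles with the existing code.
        for href in re.findall(r'<li><a href="/wiki/([^"]+)"', html):
            yield urllib.unquote(href)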
The nicer solution, of course, would be to add a "limit" parameter to Special:Allpages, so that http://en.wikipedia.org/w/index.php?title=Special%3AAllPages&from=Foo&am... would give us the page in a useful format. But from what Andre wrote, I guess that begging the MediaWiki devs for that won't have any effect.
----------------------------------------------------------------------
Comment By: Andre Engels (a_engels)
Date: 2008-08-20 10:56
Message: Logged In: YES user_id=843018 Originator: NO
Yes, Special:Allpages has changed, and we already got the "fuck off, we don't care about your framework" response when complaining.
----------------------------------------------------------------------
Comment By: Multichill (multichill)
Date: 2008-08-20 10:50
Message: Logged In: YES user_id=1777493 Originator: NO
I first noticed this last night. I have this problem on different systems (WinXP/BSD/Linux), all running the latest version. I noticed it with imageuncat.py, but interwiki.py didn't work either. It doesn't seem to matter which wiki you want to work on (nl and commons both didn't work).
Brion told me that Special:AllPages changed recently.
python version.py
Pywikipedia [http] trunk/pywikipedia (r5819, Aug 20 2008, 08:09:06)
Python 2.5.2 (r252:60911, Aug 14 2008, 13:31:58)
[GCC 4.3.1]
python imageuncat.py -start:Image:A
Checked for running processes. 1 processes currently running, including the current process.
Retrieving Allpages special page for commons:commons from A, namespace 6
<done>
----------------------------------------------------------------------
Comment By: Mikko Silvonen (silvonen)
Date: 2008-08-20 07:15
Message: Logged In: YES user_id=127947 Originator: NO
Yep, the allpages method in wikipedia.py doesn't seem to find any pages, so the -start parameter doesn't work at all. Has the format of Special:Allpages changed, or what is causing this problem? But now it's time for my day job...
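A quick way to check whether the format changed would be something along these lines (illustrative only, not part of pywikipedia; the wiki, the URL, and the regular expression for the old listing markup are assumptions):

import re
import urllib

# Fetch Special:Allpages for one wiki and check whether the old-style
# list of titles is still there (the regex is a guess at the old markup).
url = 'http://nl.wikipedia.org/w/index.php?title=Special:Allpages&from=A'
html = urllib.urlopen(url).read()
titles = re.findall(r'<li><a href="/wiki/([^"]+)"', html)
if titles:
    print 'Old-style listing found, first titles:', titles[:5]
else:
    print 'No titles matched - the layout probably changed to an index of ranges'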
----------------------------------------------------------------------