Bugs item #3539176, was opened at 2012-06-30 10:50
Message generated for change (Comment added) made by
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=603138&aid=3539176...

Please note that this message will contain a full copy of the comment thread, including the initial issue submission, for this request, not just the latest update.

Category: None
Group: None
Status: Open
Resolution: None
Priority: 5
Private: No
Submitted By: ToAruShiroiNeko ()
Assigned to: Nobody/Anonymous (nobody)
Summary: archivebot.py doesn't support unicode month names
Initial Comment:
archivebot.py doesn't work well with languages such as Turkish, in which some month names contain non-ASCII (Unicode) characters. Namely:

2 Şubat
5 Mayıs
8 Ağustos
9 Eylül
11 Kasım
12 Aralık
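
One way such a symptom could arise is a timestamp pattern whose word class only covers ASCII letters. The following is a minimal sketch of that hypothesis, not archivebot.py's actual code: the timestamp shape "HH:MM, D Month YYYY", the pattern names, and the parse_month helper are all assumptions made for illustration. It is written for Python 3, where re.ASCII imitates the default behaviour of \w in Python 2 byte-string patterns.

```python
# -*- coding: utf-8 -*-
import re

# Turkish month names, January first. The six non-ASCII ones are those
# listed in the report above.
TURKISH_MONTHS = [
    "Ocak", "Şubat", "Mart", "Nisan", "Mayıs", "Haziran",
    "Temmuz", "Ağustos", "Eylül", "Ekim", "Kasım", "Aralık",
]

# Hypothetical signature timestamp shape: "HH:MM, D Month YYYY".
TIMESTAMP = r"(\d\d):(\d\d), (\d\d?) (\w+) (\d{4})"

# \w limited to [a-zA-Z0-9_], as in Python 2 without re.UNICODE.
ASCII_PATTERN = re.compile(TIMESTAMP, re.ASCII)
# \w covering all Unicode letters.
UNICODE_PATTERN = re.compile(TIMESTAMP, re.UNICODE)


def parse_month(text, pattern):
    """Return the 1-based month number found in text, or None."""
    match = pattern.search(text)
    if match is None:
        return None
    name = match.group(4)
    if name not in TURKISH_MONTHS:
        return None
    return TURKISH_MONTHS.index(name) + 1
```

Under these assumptions, a signature dated in Haziran is recognised by both patterns, while one dated in Mayıs is only recognised by UNICODE_PATTERN, so a thread carrying only such a timestamp would be skipped silently rather than raising an error.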
----------------------------------------------------------------------
Comment By: ToAruShiroiNeko ()
Date: 2012-07-01 12:59
Message: Oh, when I initially ran the bot without -l turkish, it ignored all threads. Since it has already archived 3 of the 6 initial threads, it still reports 0 threads, as it cannot see the ones with the month name "Mayıs".
----------------------------------------------------------------------
Comment By: ToAruShiroiNeko () Date: 2012-07-01 12:57
Message: Sure. There is no traceback for me to provide, though, since the code does run; it just ignores some threads.
Run1: archivebot.py -l turkish Archive/config
Fetching template transclusions...
Getting references to [[Sablon:Archive/config]] via API...
Processing [[tr:Kullanici mesaj:??????]]
3 Threads found on [[tr:Kullanici mesaj:??????]]
Looking for: {{Archive/config}} in [[tr:Kullanici mesaj:??????]]
Processing 3 threads
There are only 0 Threads. Skipping

Run2: archivebot.py Archive/config
Fetching template transclusions...
Getting references to [[Sablon:Archive/config]] via API...
Processing [[tr:Kullanici mesaj:??????]]
3 Threads found on [[tr:Kullanici mesaj:??????]]
Looking for: {{Archive/config}} in [[tr:Kullanici mesaj:??????]]
Processing 3 threads
There are only 0 Threads. Skipping
Note that the Turkish character ı is displayed as i in the CMD window (I run the code on Windows). The ?????? stands for the name of my user talk page, http://tr.wikipedia.org/wiki/Kullan%C4%B1c%C4%B1_mesaj:%E3%81%A8%E3%81%82%E3..., which CMD cannot display because it is not in the console's character set.
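
The console output above (ı shown as i, other characters shown as ?) is consistent with a fallback that transliterates characters the console encoding cannot represent and substitutes ? when no look-alike is known. The sketch below illustrates that idea only; the console_safe function, the small TRANSLITERATION table, and the default "ascii" encoding are hypothetical and are not taken from the bot's source.

```python
# -*- coding: utf-8 -*-

# Hypothetical transliteration table covering only a few Turkish letters.
TRANSLITERATION = {"ı": "i", "ş": "s", "Ş": "S", "ğ": "g", "ü": "u"}


def console_safe(text, encoding="ascii"):
    """Replace characters that `encoding` cannot represent.

    A known look-alike from TRANSLITERATION is used when available,
    otherwise the character becomes '?'.
    """
    out = []
    for ch in text:
        try:
            ch.encode(encoding)
            out.append(ch)
        except UnicodeEncodeError:
            out.append(TRANSLITERATION.get(ch, "?"))
    return "".join(out)


if __name__ == "__main__":
    # The two kana are the characters visible in the truncated URL above.
    print(console_safe("Kullanıcı mesaj:とあ"))
    print(console_safe("Şablon:Archive/config"))
```

A fallback of this kind only affects what is printed. It would not by itself explain why threads are skipped, since the page text the bot parses is not passed through the console encoding.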
----------------------------------------------------------------------
Comment By: xqt (xqt) Date: 2012-07-01 06:45
Message: Could you give us a traceback or further information about this bug? The bot uses the month names that come from the MediaWiki messages, and I don't know what significance the locale setting has. Could you try running the bot without the --locale=tr setting?
----------------------------------------------------------------------
Comment By: ToAruShiroiNeko () Date: 2012-06-30 10:58
Message: Command line I used was archivebot.py -l turkish Archive/config
----------------------------------------------------------------------
Comment By: ToAruShiroiNeko () Date: 2012-06-30 10:55
Message:
Pywikipedia [http] trunk/pywikipedia (r10432, 2012/06/30, 15:47:55)
Python 2.7.3 (default, Apr 10 2012, 23:31:26) [MSC v.1500 32 bit (Intel)]
config-settings:
use_api = True
use_api_login = True
unicode test: ok
----------------------------------------------------------------------
pywikipedia-bugs@lists.wikimedia.org