pywikibot March 2011

pywikibot@lists.wikimedia.org

12 participants
17 discussions

[Pywikipedia-l] CodeReview is down again
by info＠gno.de 09 Mar '11

MW code review is down again and the last commits (r9019 - 9021) are lost. Thus I reopened bugzilla #24270. Greetings xqt

[Pywikipedia-l] replace.py -catr: max depth error
by Bináris 09 Mar '11

Hi folks!

I create pages for Hungarian Wikipedia like http://hu.wikipedia.org/wiki/Wikipédia:Kért_cikkek/fr, http://hu.wikipedia.org/wiki/Wikipédia:Kért_cikkek/en etc. These collect Hungary-related articles from other Wikipedias that have no Hungarian interwiki. Either they must be supplied with an iw, or they are good candidates for new articles.

First I collect all the pages with replace.py, then I upload them and process the list with a newly developed script, which I will soon offer for Pywikipedia because it can be used in other Wikipedias. It ran successfully on the en, fr and ro wikis but stopped on eswiki.

My command:

    replace.py -catr:Hungría . @ -lang:es -excepttext:"[[hu:" -savenew:magyarok.txt -always

The error message follows here. As far as I understand, it comes from Python rather than pywiki, but could we somehow handle it?

  File "C:\Program Files\Pywikipedia\catlib.py", line 167, in _getContentsNaive
    for item in page._getContentsNaive(recurse=True):
  File "C:\Program Files\Pywikipedia\catlib.py", line 167, in _getContentsNaive
    for item in page._getContentsNaive(recurse=True):
  [... the same two lines repeat many times ...]
  File "C:\Program Files\Pywikipedia\catlib.py", line 164, in _getContentsNaive
    for tag, page in self._parseCategory(startFrom=startFrom):
  File "C:\Program Files\Pywikipedia\catlib.py", line 215, in _parseCategory
    data = query.GetData(params, self.site())
  File "C:\Program Files\Pywikipedia\query.py", line 132, in GetData
    jsontext = json.loads( jsontext )
  File "C:\Program Files\Pywikipedia\simplejson\__init__.py", line 262, in loads
    return _default_decoder.decode(s)
  File "C:\Program Files\Pywikipedia\simplejson\decoder.py", line 251, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "C:\Program Files\Pywikipedia\simplejson\decoder.py", line 268, in raw_decode
    obj, end = self._scanner.iterscan(s, **kw).next()
  File "C:\Program Files\Pywikipedia\simplejson\scanner.py", line 50, in iterscan
    rval, next_pos = action(m, context)
  File "C:\Program Files\Pywikipedia\simplejson\decoder.py", line 138, in JSONObject
    value, end = iterscan(s, idx=end, context=context).next()
  File "C:\Program Files\Pywikipedia\simplejson\scanner.py", line 50, in iterscan
    rval, next_pos = action(m, context)
  File "C:\Program Files\Pywikipedia\simplejson\decoder.py", line 138, in JSONObject
    value, end = iterscan(s, idx=end, context=context).next()
  File "C:\Program Files\Pywikipedia\simplejson\scanner.py", line 50, in iterscan
    rval, next_pos = action(m, context)
  File "C:\Program Files\Pywikipedia\simplejson\decoder.py", line 171, in JSONArray
    value, end = iterscan(s, idx=end, context=context).next()
  File "C:\Program Files\Pywikipedia\simplejson\scanner.py", line 50, in iterscan
    rval, next_pos = action(m, context)
  File "C:\Program Files\Pywikipedia\simplejson\decoder.py", line 138, in JSONObject
    value, end = iterscan(s, idx=end, context=context).next()
  File "C:\Program Files\Pywikipedia\simplejson\scanner.py", line 50, in iterscan
    rval, next_pos = action(m, context)
  File "C:\Program Files\Pywikipedia\simplejson\decoder.py", line 113, in JSONString
    return scanstring(match.string, match.end(), encoding)
  File "C:\Program Files\Pywikipedia\simplejson\decoder.py", line 85, in scanstring
    if terminator == '"':
RuntimeError: maximum recursion depth exceeded in cmp
maximum recursion depth exceeded in cmp
935 titles were saved.

-- Bináris
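One possible way to avoid this kind of error, sketched here outside of pywikipedia, is to walk the category tree with an explicit queue and a set of already visited categories instead of recursive generator calls. This is only an illustration under stated assumptions: `iter_category_members`, `get_subcategories` and `get_articles` are hypothetical names, the category data is a toy dictionary, and nothing here is the actual catlib.py code. Whether the depth on eswiki comes from a category loop or simply from a very deep tree is not known from the traceback; the sketch copes with both.

```python
from collections import deque


def iter_category_members(root, get_subcategories, get_articles):
    """Yield articles of `root` and all its subcategories without recursion.

    `get_subcategories` and `get_articles` are hypothetical callables that
    return the direct subcategories / articles of one category.
    """
    seen = {root}          # categories already queued, guards against loops
    queue = deque([root])  # explicit work list replaces the call stack
    while queue:
        category = queue.popleft()
        for article in get_articles(category):
            yield article
        for sub in get_subcategories(category):
            if sub not in seen:
                seen.add(sub)
                queue.append(sub)


# Toy data (made up for this example); note the loop Historia -> Hungría.
SUBCATS = {
    "Hungría": ["Cultura", "Historia"],
    "Cultura": ["Historia"],
    "Historia": ["Hungría"],
}
ARTICLES = {
    "Hungría": ["Budapest"],
    "Cultura": ["Csárdás"],
    "Historia": ["Batalla de Mohács"],
}

titles = list(iter_category_members(
    "Hungría",
    lambda cat: SUBCATS.get(cat, []),
    lambda cat: ARTICLES.get(cat, []),
))
print(titles)
```

Because the pending categories live in a queue rather than on the Python call stack, the depth of the category tree no longer counts against the interpreter's recursion limit, and the `seen` set makes sure each category is visited only once.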

Re: [Pywikipedia-l] Encoding in HTML source
by Bináris 07 Mar '11

I am sending this back to the list:

2011/3/7 Andre Engels <andreengels(a)gmail.com>

> I don't know about that, but I think you can work the other way
> around, using a bit of regular expression magic:
>
> import re
> ...
> existing = [wikipedia.Page(wikipedia.getSite(), pname).title() for
>             pname in re.findall(r"title=(.*?)&action=edit", fullsourcetext)]
>
> def exists(page):
>     return page.title() in existing

This works fine! I didn't know that Page() accepts encoded titles. There are already some regexes in my code. Thank you!

-- Bináris
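For readers without a pywikipedia installation, the idea quoted above can be tried with the Python 3 standard library alone. This is a rough sketch, not Andre's code: `normalize` and `linked_titles` are made-up helper names, `urllib.parse.unquote` stands in for what `Page(...).title()` does in the original, and the regex is tightened slightly (it also accepts `&amp;` and refuses to run across a quote character), which is an assumption about how the HTML source looks.

```python
import re
from urllib.parse import unquote


def normalize(raw_title):
    """Turn a percent-encoded URL title into a readable page title."""
    return unquote(raw_title).replace("_", " ")


def linked_titles(html):
    """Collect the titles of all links that point to action=edit."""
    pattern = r'title=([^&"]+)&(?:amp;)?action=edit'
    return {normalize(raw) for raw in re.findall(pattern, html)}


# Sample line modelled on the one in the original question.
SAMPLE = (
    '<li><a href="/w/index.php?title=Avant_l%27aurore_(court-m%C3%A9trage)'
    '&amp;action=edit&amp;redlink=1" class="new" '
    'title="Avant l\'aurore (court-métrage) (page does not exist)">'
    "Avant l'aurore (court-métrage)</a></li>"
)

found = linked_titles(SAMPLE)
print(found)
```

Using a set instead of a list keeps the later membership test (`title in found`) fast even for long pages.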

[Pywikipedia-l] Encoding in HTML source
by Bináris 07 Mar '11

Hi,

when I download a page in HTML that contains titles of articles, these titles look almost as if they had gone through urlencode(), but not quite; characters like "(", ")", "!", ",", ":" appear without encoding. For example:

<li><a href="/w/index.php?title=Avant_l%27aurore_(court-m%C3%A9trage)&action=edit&redlink=1" class="new" title="Avant l'aurore (court-métrage) (page does not exist)">Avant l'aurore (court-métrage)</a></li>

Is there a function in pywiki to handle this, or is a full list of the non-encoded characters available somewhere? I used urlencode() plus a dict of known exceptions, but this is not the best solution.

-- Bináris
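If the titles have to be encoded rather than decoded, the "dict of known exceptions" can probably be replaced by the `safe` parameter of the standard `quote` function. The sketch below uses Python 3's `urllib.parse.quote`; the set of safe characters contains only the five characters named in the message above, so it is an assumption based on that observation and not a complete list of what MediaWiki leaves unencoded. `encode_title` is a hypothetical helper name.

```python
from urllib.parse import quote

# Only the characters observed in the message above; an assumption,
# not an authoritative list. Extend it if more exceptions turn up.
OBSERVED_SAFE = "()!,:"


def encode_title(title):
    """Percent-encode a page title the way it appeared in the HTML source."""
    return quote(title.replace(" ", "_"), safe=OBSERVED_SAFE)


encoded = encode_title("Avant l'aurore (court-métrage)")
print(encoded)
```

Note that passing `safe` explicitly replaces the default value `"/"`, so a slash in a title would be encoded unless `"/"` is added to `OBSERVED_SAFE`.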

Re: [Pywikipedia-l] [Pywikipedia-svn] SVN: [9018] trunk/pywikipedia/interwiki.py
by Merlijn van Deen 06 Mar '11

On 2 March 2011 07:46, <xqt(a)svn.wikimedia.org> wrote:

> Log Message:
> -----------
> hak-wiki is also affected due to bug #3081100
>
> Modified: trunk/pywikipedia/interwiki.py
> ===================================================================
> -                rmPage.site().lang in ['hi', 'cdo'] and \
> +                rmPage.site().lang in ['hak', 'hi', 'cdo'] and \

Note that *any* wiki can be affected - it's a problem with certain combinations of characters (multiple accents etc). This only commonly happens in some languages, but in principle any language that uses several accents can be affected.

Not quite sure what the correct way of handling this would be though -- blocking all bots that trigger the unicode bug might be a bit too much.

Best regards,
Merlijn

[Pywikipedia-l] SVN help
by Bináris 04 Mar '11

Hi,

I have just installed TortoiseSVN on my machine, and I am trying to understand it from the documentation. I need it for

-- restoring older versions of my misdeveloped programs (after some sad experiences)
-- making diffs/patches to upload to SF
-- updating Pywikipedia (until now I have used the nightlies for this purpose)

As far as I understand, I need my own repository here for the first goal. I thought I would first synchronize my working copy (which should be the active pywikibot) with my local repository and then update it from the Pywiki repository. Is this a good concept? What will happen to the rev numbers this way? Will they get mixed up? Or how do you solve this (those of you who develop the bot and do not only use it)?

-- Bináris

[Pywikipedia-l] replace.py -silent -- please comment
by Bináris 03 Mar '11

I am just running

    replace.py -catr:%D0%92%D0%B5%D0%BD%D0%B3%D1%80%D0%B8%D1%8F . @ -lang:ru -excepttext:"[[hu:" -save:magyarok.txt -always

to collect Hungary-related articles from Russian Wikipedia. It has already been running for 10 hours, mostly because of pywikibot.output. When the overwhelming majority of characters appear in yellow (substituted), this function slows the program down extremely. I can see it writing the text on my screen character by character.

This gives me the idea of introducing a new switch: -silent. From line 474 on,

    # Show the title of the page we're working on.
    # Highlight the title in purple.
    pywikibot.output(u"\n\n>>> \03{lightpurple}%s\03{default} <<<"
                     % page.title())
    pywikibot.showDiff(original_text, new_text)

should not be executed when this switch is on. This is recommended only for cases such as the one mentioned above.

Because this can be dangerous, there could be a restriction that allows this switch to work only together with -always. OR, an even more restrictive rule: it can work only with -save/-savenew (one could argue that any work on a live wiki should appear on the screen). Which one is better?

-- Bináris
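To make the two candidate rules easier to compare, here is a small stand-alone sketch. It is not a patch for replace.py: `silent_permitted` and `report_page` are hypothetical names, plain `print` stands in for pywikibot.output, and the `strict` flag merely selects between the two rules proposed above.

```python
def silent_permitted(always, saving_to_file, strict=False):
    """Decide whether -silent may be honoured.

    strict=False: the first rule, -silent works only together with -always.
    strict=True:  the second rule, -silent works only with -save/-savenew.
    """
    if strict:
        return saving_to_file
    return always


def report_page(title, silent, output=print):
    """Show the page title unless output is silenced; return True if shown."""
    if silent:
        return False
    output(">>> %s <<<" % title)
    return True


# Example run: the command line from the message uses -always and -save.
silent = silent_permitted(always=True, saving_to_file=True, strict=True)

shown = []
report_page("Budapest", silent=False, output=shown.append)
report_page("Debrecen", silent=silent, output=shown.append)
print(shown)
```

With the stricter rule, a run that edits the live wiki with -always but without -save/-savenew would still print every title and diff, which matches the argument that work on a live wiki should stay visible.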
