Hi folks!
I create pages for Hungarian Wikipedia like
http://hu.wikipedia.org/wiki/Wikipédia:Kért_cikkek/fr,
http://hu.wikipedia.org/wiki/Wikipédia:Kért_cikkek/en etc. These collect
Hungary-related articles from other Wikipedias that have no Hungarian
interwiki. Either they need an interwiki link added, or they are good
candidates for new articles.
First I collect all the pages with replace.py, then I upload them and
process the list with a newly developed script which I will soon offer for
Pywikipedia because it can be used in other Wikipedias.
It ran successfully on the en, fr and ro wikis but stopped on eswiki.
My command:
replace.py -catr:Hungría . @ -lang:es -excepttext:"[[hu:" -savenew:magyarok.txt -always
The error message follows here. As far as I understand it comes from Python
rather than pywiki, but could we somehow handle it?
File "C:\Program Files\Pywikipedia\catlib.py", line 167, in _getContentsNaive
for item in page._getContentsNaive(recurse=True):
[Previous frame repeated 122 more times]
File "C:\Program Files\Pywikipedia\catlib.py", line 164, in _getContentsNaive
for tag, page in self._parseCategory(startFrom=startFrom):
File "C:\Program Files\Pywikipedia\catlib.py", line 215, in _parseCategory
data = query.GetData(params, self.site())
File "C:\Program Files\Pywikipedia\query.py", line 132, in GetData
jsontext = json.loads( jsontext )
File "C:\Program Files\Pywikipedia\simplejson\__init__.py", line 262, in loads
return _default_decoder.decode(s)
File "C:\Program Files\Pywikipedia\simplejson\decoder.py", line 251, in decode
obj, end = self.raw_decode(s, idx=_w(s, 0).end())
File "C:\Program Files\Pywikipedia\simplejson\decoder.py", line 268, in raw_decode
obj, end = self._scanner.iterscan(s, **kw).next()
File "C:\Program Files\Pywikipedia\simplejson\scanner.py", line 50, in iterscan
rval, next_pos = action(m, context)
File "C:\Program Files\Pywikipedia\simplejson\decoder.py", line 138, in JSONObject
value, end = iterscan(s, idx=end, context=context).next()
File "C:\Program Files\Pywikipedia\simplejson\scanner.py", line 50, in iterscan
rval, next_pos = action(m, context)
File "C:\Program Files\Pywikipedia\simplejson\decoder.py", line 138, in JSONObject
value, end = iterscan(s, idx=end, context=context).next()
File "C:\Program Files\Pywikipedia\simplejson\scanner.py", line 50, in iterscan
rval, next_pos = action(m, context)
File "C:\Program Files\Pywikipedia\simplejson\decoder.py", line 171, in JSONArray
value, end = iterscan(s, idx=end, context=context).next()
File "C:\Program Files\Pywikipedia\simplejson\scanner.py", line 50, in iterscan
rval, next_pos = action(m, context)
File "C:\Program Files\Pywikipedia\simplejson\decoder.py", line 138, in JSONObject
value, end = iterscan(s, idx=end, context=context).next()
File "C:\Program Files\Pywikipedia\simplejson\scanner.py", line 50, in iterscan
rval, next_pos = action(m, context)
File "C:\Program Files\Pywikipedia\simplejson\decoder.py", line 113, in JSONString
return scanstring(match.string, match.end(), encoding)
File "C:\Program Files\Pywikipedia\simplejson\decoder.py", line 85, in scanstring
if terminator == '"':
RuntimeError: maximum recursion depth exceeded in cmp
maximum recursion depth exceeded in cmp
935 titles were saved.
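One possible way to handle this, as far as I can judge from the traceback: every subcategory level adds another _getContentsNaive frame, so a deep (or looping) category tree exhausts Python's recursion limit. Below is a minimal stand-alone sketch of a non-recursive walk that uses a queue and remembers the categories it has already seen. It is not the catlib code; walk_category_tree, get_subcategories, get_articles and the toy data are hypothetical names used only for illustration, and I have not verified that the eswiki tree really contains a loop.

```python
from collections import deque


def walk_category_tree(root, get_subcategories, get_articles):
    """Yield every article under root once, without recursive calls.

    get_subcategories and get_articles are hypothetical callbacks that
    return the direct subcategories / articles of one category.
    """
    seen_categories = {root}
    seen_articles = set()
    queue = deque([root])
    while queue:
        category = queue.popleft()
        for article in get_articles(category):
            if article not in seen_articles:
                seen_articles.add(article)
                yield article
        for sub in get_subcategories(category):
            # Skipping categories we have already queued breaks loops.
            if sub not in seen_categories:
                seen_categories.add(sub)
                queue.append(sub)


# Toy data (hypothetical): category B links back to A, forming a loop.
SUBCATS = {"A": ["B"], "B": ["C", "A"], "C": []}
ARTICLES = {"A": ["a1"], "B": ["b1"], "C": ["c1", "a1"]}

titles = list(walk_category_tree("A", SUBCATS.__getitem__, ARTICLES.__getitem__))
print(titles)
```

Because the pending categories live in a queue instead of on the call stack, the depth of the tree no longer matters, and the set of seen categories keeps a loop from running forever.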
--
Bináris
I am sending this back to the list:
2011/3/7 Andre Engels <andreengels(a)gmail.com>
> I don't know about that, but I think you can work the other way
> around, using a bit of regular expression magic:
>
> import re
> ...
> existing = [wikipedia.Page(wikipedia.getSite(), pname).title() for
>             pname in re.findall(r"title=(.*?)&action=edit", fullsourcetext)]
>
> def exists(page):
>     return page.title() in existing
>
This works fine! I didn't know that Page() could take encoded titles.
There are already some regexes in my code. Thank you!
--
Bináris
Hi,
when I download a page in HTML, which contains titles of articles, these
titles are something like urlencode()-ed, but not quite; characters like
"(", ")", "!", ",", ":" appear without encoding.
For example:
<li><a href="/w/index.php?title=Avant_l%27aurore_(court-m%C3%A9trage)&action=edit&redlink=1"
class="new" title="Avant l'aurore (court-métrage) (page does not
exist)">Avant l'aurore (court-métrage)</a></li>
Is there a function in pywiki to handle this, or is a full list of the
non-encoded characters available? I used urlencode() plus a dict of known
exceptions, but this is not the best solution.
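For comparison, here is a minimal sketch using only the Python 3 standard library (urllib.parse) rather than any pywiki function. Decoding needs no exception list at all, because unquote() leaves characters that are already unencoded alone. For encoding, the safe parameter of quote() can replace the dict of exceptions. The helper names and the UNENCODED list are my own assumptions; the list contains only the characters named above and may well be incomplete.

```python
import re
from urllib.parse import quote, unquote

# Characters reported above as appearing unencoded in the URLs.
# Assumption: the real list used by the wiki software may be longer.
UNENCODED = "()!,:"


def decode_title(url_title):
    """Turn the title= value of a wiki URL into a readable page title."""
    return unquote(url_title).replace("_", " ")


def encode_title(title):
    """Encode a page title, leaving the UNENCODED characters as they are."""
    return quote(title.replace(" ", "_"), safe=UNENCODED)


def redlink_titles(html):
    """Collect the decoded titles of all edit links found in html."""
    return [decode_title(t)
            for t in re.findall(r"title=(.*?)&action=edit", html)]


sample = ('<li><a href="/w/index.php?title=Avant_l%27aurore_'
          '(court-m%C3%A9trage)&action=edit&redlink=1" class="new">x</a></li>')
print(redlink_titles(sample))
print(encode_title("Avant l'aurore (court-métrage)"))
```

Note that passing safe= replaces quote()'s default, so "/" would be percent-encoded here unless it is added to UNENCODED.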
--
Bináris
On 2 March 2011 07:46, <xqt(a)svn.wikimedia.org> wrote:
> Log Message:
> -----------
> hak-wiki is also affected due to bug #3081100
>
> Modified: trunk/pywikipedia/interwiki.py
> ===================================================================
> - rmPage.site().lang in ['hi', 'cdo'] and \
> + rmPage.site().lang in ['hak', 'hi', 'cdo'] and \
>
Note that *any* wiki can be affected - it's a problem with certain
combinations of characters (multiple accents etc). This only commonly
happens in some languages, but in principle any language that uses several
accents can be affected. Not quite sure what the correct way of handling
this would be though -- blocking all bots that trigger the unicode bug might
be a bit too much.
Best regards,
Merlijn
Hi,
I have just installed TortoiseSVN on my machine, and I am trying to
understand it from documentation.
I need it for
-- restoring older versions of my programs when development goes wrong
(after some sad experiences)
-- making diffs/patches to upload to SF
-- updating Pywikipedia (until now I have used the nightlies for this)
As far as I understand, I need my own repository here for the first goal. I
thought I would first synchronize my working copy (which should be the
active pywikibot) with my local repository and then update it from the
Pywikipedia repository.
Is this a good concept?
What will happen to the revision numbers this way? Will they get mixed up?
Or how do those of you who develop the bot, and not only use it, solve this?
--
Bináris
I am currently running
replace.py -catr:%D0%92%D0%B5%D0%BD%D0%B3%D1%80%D0%B8%D1%8F . @ -lang:ru -excepttext:"[[hu:" -save:magyarok.txt -always
to collect Hungary-related articles from Russian Wikipedia.
It has already been running for 10 hours, mostly because of
pywikibot.output. When the overwhelming majority of characters is displayed
in yellow (substituted), this function slows the program down extremely. I
can watch it writing text to my screen character by character.
This gives me the idea to introduce a new switch: -silent.
From line 474 on
    # Show the title of the page we're working on.
    # Highlight the title in purple.
    pywikibot.output(u"\n\n>>> \03{lightpurple}%s\03{default} <<<"
                     % page.title())
    pywikibot.showDiff(original_text, new_text)
should not be executed with this switch on. This is recommended only for
cases like the one mentioned above.
Because this can be dangerous, there could be a restriction that allows
this switch to work only together with -always.
OR, an even more restrictive rule: it can work only with -save/-savenew (one
could argue that any work on a live wiki should appear on the screen).
Which one is better?
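To make the two variants concrete, here is a rough stand-alone sketch. It does not touch replace.py or pywikibot; parse_switches, silent_active and report are hypothetical helpers, and -silent itself is only the proposed switch, not an existing one.

```python
def parse_switches(args):
    """Pick the three relevant switches out of a command line."""
    opts = {"silent": False, "always": False, "save": None}
    for arg in args:
        if arg == "-silent":
            opts["silent"] = True
        elif arg == "-always":
            opts["always"] = True
        elif arg.startswith("-save:") or arg.startswith("-savenew:"):
            opts["save"] = arg.split(":", 1)[1]
    return opts


def silent_active(opts, strict=False):
    """Decide whether -silent may really suppress the screen output.

    strict=False: -silent only counts together with -always.
    strict=True:  -silent only counts together with -save/-savenew.
    """
    if not opts["silent"]:
        return False
    if strict:
        return opts["save"] is not None
    return opts["always"]


def report(title, opts, out=print, strict=False):
    """Stand-in for the title/diff output that -silent would skip."""
    if silent_active(opts, strict):
        return
    out(">>> %s <<<" % title)


opts = parse_switches(["-silent", "-savenew:magyarok.txt", "-always"])
report("Some page", opts)               # prints nothing
report("Some page", parse_switches([]))  # prints the title
```

With either rule, a -silent given without its required companion switch is simply ignored, so the bot falls back to showing everything.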
--
Bináris