Bugs item #3414669, was opened at 2011-09-27 21:50
Message generated for change (Comment added) made by valhallasw
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=603138&aid=3414669...
Please note that this message will contain a full copy of the comment thread, including the initial issue submission, for this request, not just the latest update.

Category: interwiki
Group: None
Status: Open
Resolution: None
Priority: 8
Private: No
Submitted By: hiw (hiw)
Assigned to: Nobody/Anonymous (nobody)
Summary: interwiki.py removing page text
Initial Comment:
After the edit linked below on an NL disambiguation page, the page was emptied; only the interwiki link remained. Interwiki.py should not have touched the page in the first place, since the interwiki link had already been set earlier.
Diff-link: http://nl.wikipedia.org/w/index.php?title=Blankenbach&diff=next&oldi...
Active Python on Microsoft Windows XP [Version 5.1.2600]
Pywikipedia [http] trunk/pywikipedia (r9558, 2011/09/25, 20:30:54)
Python 2.7.2 (default, Jun 24 2011, 12:21:10) [MSC v.1500 32 bit (Intel)]
config-settings:
use_api = True
use_api_login = True
unicode test: ok
----------------------------------------------------------------------
Comment By: Merlijn S. van Deen (valhallasw)
Date: 2011-09-29 21:34
Message: Confirmed on eowiki, 25 suspected pages http://eo.wikipedia.org/w/index.php?title=Anton%C3%ADn_Kl%C3%A1%C5%A1tersk%C...
Confirmed on simplewiki, 3 suspected pages
itwiki: no results
ptwiki: no results
dewiki: no results
frwiki: results, but all from the same antivandalism bot
----------------------------------------------------------------------
Comment By: Merlijn S. van Deen (valhallasw) Date: 2011-09-29 21:14
Message: Using the following query to find suspected edits...

select rc_cur_time, rc_user, rc_namespace, rc_title, rc_old_len, rc_new_len
from recentchanges
left join user_groups on ug_user=rc_user
where rc_new_len < rc_old_len * 0.1 and ug_group = 'bot' and rc_namespace=0;

(note: this will not find *all* bad edits, but at least some)...
http://nl.wikipedia.org/w/index.php?title=Alexander_Gottfried&diff=prev&...
http://nl.wikipedia.org/w/index.php?title=Angerapp&diff=27329689&old...
http://nl.wikipedia.org/w/index.php?title=Partjessnijder&diff=27331463&a...
http://nl.wikipedia.org/w/index.php?title=Atax&diff=27330470&oldid=1...
http://nl.wikipedia.org/w/index.php?title=Medinilla&diff=27328890&ol...
http://nl.wikipedia.org/w/index.php?title=Merklin&diff=27330198&oldi...
http://nl.wikipedia.org/w/index.php?title=Pion&diff=27327730&oldid=1...
http://nl.wikipedia.org/w/index.php?title=Vossenplein&diff=27327943&...
http://nl.wikipedia.org/w/index.php?title=Walser&diff=27329293&oldid...
so.. at least the specificity is good, even if the sensitivity is not. I'll try and see what happens on different wikis. Hopefully this will give some hint whether it's 1.18 related or not.
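
The filter in the query above can also be restated outside SQL. What follows is a minimal Python 3 sketch, not part of pywikipedia: the Change record and the suspected_blankings function are hypothetical names chosen for illustration. It keeps main-namespace bot edits whose new length is below 10% of the old length, mirroring the WHERE clause.

```python
from typing import Iterable, List, NamedTuple


class Change(NamedTuple):
    """Hypothetical record holding the fields the query relies on."""
    title: str
    namespace: int
    old_len: int
    new_len: int
    is_bot: bool


def suspected_blankings(changes: Iterable[Change], ratio: float = 0.1) -> List[Change]:
    """Return bot edits in namespace 0 that shrank a page below `ratio` of its old size.

    Like the SQL query, this is a heuristic: it will miss some bad edits.
    """
    return [
        c for c in changes
        if c.is_bot and c.namespace == 0 and c.new_len < c.old_len * ratio
    ]
```

The 0.1 ratio is taken from the query; a looser ratio would catch more edits at the cost of more false positives.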
----------------------------------------------------------------------
Comment By: Merlijn S. van Deen (valhallasw) Date: 2011-09-29 20:39
Message: At the moment: no. In theory, the special:export function could probably be replaced by one or more API calls, but I have no reason to assume this actually solves the problem...
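
For reference, fetching page text through the API instead of special:export might look roughly like the Python 3 sketch below. It only builds the request URL and reads the text out of an already-decoded JSON response in the legacy response format; build_content_query and extract_text are hypothetical names, not pywikipedia functions, and nothing here suggests that switching to the API would solve the problem.

```python
from urllib.parse import urlencode


def build_content_query(api_url, title):
    """Return a URL asking the MediaWiki API for the latest revision text of `title`."""
    params = {
        "action": "query",
        "prop": "revisions",
        "rvprop": "content",
        "titles": title,
        "format": "json",
    }
    return api_url + "?" + urlencode(params)


def extract_text(response):
    """Return the wikitext from a decoded JSON response, or None if no revision is present."""
    pages = response.get("query", {}).get("pages", {})
    for page in pages.values():
        revisions = page.get("revisions")
        if revisions:
            # In the legacy JSON format the revision text is stored under the "*" key.
            return revisions[0].get("*")
    return None
```

A caller would pass the wiki's api.php address (an assumption here, for example the /w/api.php path next to index.php) and treat a None result as "do not save" rather than as an empty page.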
----------------------------------------------------------------------
Comment By: hiw (hiw) Date: 2011-09-29 04:58
Message: Can you force the script to use API to get the page text?
----------------------------------------------------------------------
Comment By: Merlijn S. van Deen (valhallasw) Date: 2011-09-28 21:47
Message: I did some more testing, using

python interwiki.py -lang:de -page:Blankenbach%20%28Begriffskl%C3%A4rung%29 -async -cleanup -auto -async

note that these findings are not necessarily true for running on full auto...
in this setup, the bot ALWAYS uses special:export to get page text. It does use the API to write the pages. It only retrieves the pages ONCE, at the start of the run.
sigh.
----------------------------------------------------------------------
Comment By: hiw (hiw) Date: 2011-09-28 00:31
Message: Pffff, I believe it was:
interwiki.py -all -async -cleanup -log -auto -start:
I think I used -ns:0 as well, but I'm not sure.
----------------------------------------------------------------------
Comment By: Merlijn S. van Deen (valhallasw) Date: 2011-09-27 22:53
Message: Question to both committer and myst: what was the exact command line you were using?
----------------------------------------------------------------------
Comment By: Merlijn S. van Deen (valhallasw) Date: 2011-09-27 22:49
Message: Last note for tonight: quickly reviewing the diff to r9500 (2011-09-03) did not turn up any real change. Note: I did this in one batch. Reviewing commits from the mailing list one at a time might still be a good plan...
----------------------------------------------------------------------
Comment By: Merlijn S. van Deen (valhallasw) Date: 2011-09-27 22:42
Message: Last three edits of interwiki.py are all quite old:
------------------------------------------------------------------------
r9407 | xqt | 2011-07-16 23:35:06 +0200 (Sat, 16 Jul 2011) | 1 line

trailing space for list elements (readability)
------------------------------------------------------------------------
r9387 | amir | 2011-07-16 12:05:50 +0200 (Sat, 16 Jul 2011) | 1 line

adding fa for exception templates
------------------------------------------------------------------------
r9308 | xqt | 2011-06-24 19:14:40 +0200 (Fri, 24 Jun 2011) | 1 line

do not follow static redirects which means do not change the target links like -noredirect does (with -cleanup option. -force removes that link - maybe this should be fixed)
------------------------------------------------------------------------
----------------------------------------------------------------------
Comment By: Merlijn S. van Deen (valhallasw) Date: 2011-09-27 22:37
Message: This has also happened with Myst's bot on simplewiki: http://simple.wikipedia.org/w/index.php?title=Mettau%2C_Switzerland&acti...
Increasing priority, rephrased title.
----------------------------------------------------------------------
pywikipedia-bugs@lists.wikimedia.org