---------- Forwarded message ----------
From: <pywikipediabot-users-owner(a)lists.sourceforge.net>
Date: Wed, Feb 27, 2008 at 8:03 PM
Subject: How do I make interwiki.py go over all the pages in a given namespace?
To: avarab(a)gmail.com
You are not allowed to post to this mailing list, and your message has
been automatically rejected. If you think that your messages are
being rejected in error, contact the mailing list owner at
pywikipediabot-users-owner(a)lists.sourceforge.net.
---------- Forwarded message ----------
From: "Ævar Arnfjörð Bjarmason" <avarab(a)gmail.com>
To: pywikipediabot-users(a)lists.sourceforge.net
Date: Wed, 27 Feb 2008 20:03:50 +0000
Subject: How do I make interwiki.py go over all the pages in a given namespace?
I was trying to go over all the pages in the category namespace, but I
couldn't get -namespace:14, -namespace:Category,
-namespace:Category:'!', or -namespace:Flokkur (this was on iswiki) to
work.
I ended up running it as `python interwiki.py -autonomous -skipauto
-namespace:14 -continue' with the following hack. This seems to be
working, but is there a proper way to do this?
Index: interwiki.py
===================================================================
--- interwiki.py (revision 5080)
+++ interwiki.py (working copy)
@@ -1620,7 +1620,10 @@
         except NameError:
             wikipedia.output(u"Dump file is empty?! Starting at the beginning.")
             nextPage = "!"
-            namespace = 0
+            if namespaces:
+                namespace = namespaces[0]
+            else:
+                namespace = 0
     # old generator is used up, create a new one
     hintlessPageGen = pagegenerators.CombinedPageGenerator([pagegenerators.TextfilePageGenerator(dumpFileName),\
         pagegenerators.AllpagesPageGenerator(nextPage, namespace, includeredirects = False)])
Hello
I have been using pywikipediabot for years and have written some patches for
it. Since I fetch fresh versions from SVN, I get some annoying conflicts and
.mine files spamming my bot directory. I would like to commit my work; is
this the place to ask for such permission?
The attached file shows an example of the work I've done on
fixing_redirects.py: before the fixes, the script misbehaved on ~5% of pages
(case artifacts, images and categories not handled, ...); after the fixes, it
ran with 100% success from "-start:!" to the end of allpages (yes, I'm quite
the perfectionist, so I like spotting *every* link leading to a redirect on
the wiki I work on: fr.wikibooks, which has 5000+ pages). I've also fixed
several bugs listed on the SourceForge bug tracking system. Could I also have
permission to tag them as closed? (Or should I ask elsewhere?)
Marvus
Feature Requests item #1877143, was opened at 2008-01-22 10:31
Message generated for change (Comment added) made by purodha
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=603141&aid=1877143&group_…
Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.
Category: None
Group: None
Status: Open
Priority: 5
Private: No
Submitted By: Purodha B Blissenbach (purodha)
Assigned to: Nobody/Anonymous (nobody)
Summary: make interwiki.py accept a hint at namespace warning
Initial Comment:
WARNING: [[ksh:Metmaacher:Purbo T]] is in namespace 2, but [[dsb:Benutzer:Purbo T]] is in namespace 0. Follow it anyway? ([y]es, [n]o) n
should better be:
WARNING: [[ksh:Metmaacher:Purbo T]] is in namespace 2, but [[dsb:Benutzer:Purbo T]] is in namespace 0. Follow it anyway? ([y]es, [n]o, [a]dd replacement) n
The reason for the warning is an update of the language file in the dsb wiki, which changes Benutzer -> Wužywar
(away from the German fallback).
Thus many previously set interwiki links need a namespace change. While this could be dealt with using a general search/replace, interwiki.py should not lose the interwiki links while it is operating, before all links have been adjusted. A manual replacement (hint) would be the ideal solution here.
If possible, it would be nice to have a typed-in "user:" automatically expanded to "User:Purbo T" in such cases, but that is rather a gimmick.
----------------------------------------------------------------------
>Comment By: Purodha B Blissenbach (purodha)
Date: 2008-02-29 13:49
Message:
Logged In: YES
user_id=46450
Originator: YES
See also these two edits:
http://mi.wikipedia.org/w/index.php?title=Category:M%C4%81tauranga_huaota&d…
http://mi.wikipedia.org/w/index.php?title=Category%3AM%C4%81tauranga_huaota…
This gives another motivation, or justification, for the suggested
feature.
With the feature added: When seeing a possible solution, you could enter
that solution immediately, saving time, server load and, more often than
not, another bot run.
----------------------------------------------------------------------
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=603141&aid=1877143&group_…
Patches item #1904587, was opened at 2008-02-29 10:55
Message generated for change (Tracker Item Submitted) made by Item Submitter
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=603140&aid=1904587&group_…
Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.
Category: None
Group: None
Status: Open
Resolution: None
Priority: 5
Private: No
Submitted By: Purodha B Blissenbach (purodha)
Assigned to: Nobody/Anonymous (nobody)
Summary: Interwiki.py - better language fallbacks: dsb, hsb, stq
Initial Comment:
svn diff wikipedia.py
Index: wikipedia.py
===================================================================
--- wikipedia.py (revision 5095)
+++ wikipedia.py (working copy)
@@ -5525,10 +5527,14 @@
         return ['ar','tr']
     if code=='sk':
         return ['cs']
-    if code in ['bar','hsb','ksh']:
+    if code in ['bar','ksh','stq']:
         return ['de']
     if code in ['als','lb']:
         return ['de','fr']
+    if code=='dsb':
+        return ['hsb','de']
+    if code=='hsb':
+        return ['dsb','de']
     if code=='io':
         return ['eo']
     if code in ['an','ast','ay','ca','gn','nah','qu']:
----
Adds Saterlandic Frisian (Seeltersk)
Makes Upper and Lower Sorbian fall back to each other before resorting to German.
----------------------------------------------------------------------
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=603140&aid=1904587&group_…
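The fallback chains in the patch can be pictured as a plain mapping from language code to an ordered list of codes to try next (a hedged sketch; the real implementation is a series of if statements in wikipedia.py, not a dict):

```python
# Sketch only: the fallback table the patch effectively defines.
FALLBACKS = {
    'bar': ['de'],
    'ksh': ['de'],
    'stq': ['de'],           # Saterland Frisian, added by the patch
    'als': ['de', 'fr'],
    'lb':  ['de', 'fr'],
    'dsb': ['hsb', 'de'],    # Lower Sorbian tries Upper Sorbian first
    'hsb': ['dsb', 'de'],    # and vice versa, before German
}

def fallback_chain(code):
    """Return the ordered fallback codes for a language code."""
    return FALLBACKS.get(code, [])

print(fallback_chain('dsb'))  # ['hsb', 'de']
```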
Bugs item #1792829, was opened at 2007-09-11 19:28
Message generated for change (Comment added) made by russblau
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=603138&aid=1792829&group_…
Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.
Category: None
Group: None
>Status: Closed
>Resolution: Fixed
Priority: 5
Private: No
Submitted By: Nobody/Anonymous (nobody)
Assigned to: Nobody/Anonymous (nobody)
Summary: Bug in wikipedia.py and workaround
Initial Comment:
I am using snapshot 2007-08-11:
Bug #1733835 appears in this snapshot with a different error message:
"Changing page [[de:Aventurischer Index: Buchstabe J/fehlt noch]]
WARNING: No text area found on www.wiki-aventurica.de/index.php?title=MediaWiki:
viewsource&action=edit.
Maybe the server is down. Retrying in 1 minutes..."
We don't even have a page named "MediaWiki:viewsource" in our Wiki.
After I changed line 1176 from
"if data != u'':"
to
"if data != u'' and re.search(r'[^\n]', data) != None:"
it works properly again.
I did not have this bug in snapshot-2007-06-19, but in snapshot-20070605 and now.
----------------------------------------------------------------------
>Comment By: Russell Blau (russblau)
Date: 2008-02-28 18:16
Message:
Logged In: YES
user_id=855050
Originator: NO
Should be fixed by r5095.
----------------------------------------------------------------------
Comment By: Bernhard Mayr (falk_steinhauer)
Date: 2007-09-11 19:30
Message:
Logged In: YES
user_id=1810075
Originator: NO
I reported the bug. I guess my login cookie was too old.
----------------------------------------------------------------------
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=603138&aid=1792829&group_…
Bugs item #1766974, was opened at 2007-08-03 10:24
Message generated for change (Settings changed) made by russblau
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=603138&aid=1766974&group_…
Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.
Category: None
Group: None
>Status: Closed
>Resolution: Invalid
Priority: 5
Private: No
Submitted By: Nobody/Anonymous (nobody)
Assigned to: Nobody/Anonymous (nobody)
Summary: colon omitted in link to category page
Initial Comment:
Consider a page X containing only the following wiki markup:
[[:Category:some category]]
Let page be a Page object for X. Then
links = page.linkedPages()
for link in links:
wikipedia.output(page.title())
wikipedia.output(page.aslink())
outputs:
Category:some category
[[Category:some category]]
i.e. the leading colon has been swallowed. I suggest the colon be kept due to the semantic difference between [[:Category:some category]] and [[Category:some category]].
(A similar problem might exist with interlanguage links; I haven't tested, though)
----------------------------------------------------------------------
Comment By: Russell Blau (russblau)
Date: 2007-08-03 11:01
Message:
Logged In: YES
user_id=855050
Originator: NO
Page.aslink() has an optional parameter "textlink" (defaults to False); if
True, the leading colon will be output. This is a relatively recent
addition to the framework, and many bots have not yet been updated to use
it.
This parameter does not yet work for interwiki links.
----------------------------------------------------------------------
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=603138&aid=1766974&group_…
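A toy illustration of the textlink behaviour described in the comment above (the `TitleLink` class is invented for this sketch; it is not the framework's Page class):

```python
# Illustration only, not the real framework class.
class TitleLink:
    def __init__(self, title):
        self.title = title

    def aslink(self, textlink=False):
        # With textlink=True a leading colon is emitted, turning a
        # category inclusion into a plain link to the category page.
        prefix = ':' if textlink else ''
        return '[[%s%s]]' % (prefix, self.title)

link = TitleLink('Category:some category')
print(link.aslink())               # [[Category:some category]]
print(link.aslink(textlink=True))  # [[:Category:some category]]
```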
Bugs item #1733835, was opened at 2007-06-08 18:24
Message generated for change (Comment added) made by russblau
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=603138&aid=1733835&group_…
Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.
Category: None
Group: None
>Status: Closed
>Resolution: Fixed
Priority: 5
Private: No
Submitted By: Nobody/Anonymous (nobody)
Assigned to: Nobody/Anonymous (nobody)
Summary: Bug in wikipedia.py and unsatisfying fix
Initial Comment:
I am using snapshot-20070605.
I recognized a bug in wikipedia.py (line 1146, the last one in the code segment below):
    # Submit the prepared information
    if self.site().hostname() in config.authenticate.keys():
        predata.append(("Content-type","application/x-www-form-urlencoded"))
        predata.append(("User-agent", useragent))
        data = self.site().urlEncode(predata)
        response = urllib2.urlopen(urllib2.Request('http://' + self.site().hostname() + address, data))
        # I'm not sure what to check in this case, so I just assume things went ok.
        # Very naive, I agree.
        data = u''
    else:
        try:
            response, data = self.site().postForm(address, predata, sysop)
        except httplib.BadStatusLine, line:
            raise PageNotSaved('Bad status line: %s' % line)
    if data != u'' and re.search(r'[^\n]', data) != None:
I added the condition behind the "and", because after I accepted changes for a given page "data" was unequal to ''. Instead "data" was a string of 4 newlines.
Hope it'll help you.
Best regards,
Falk Steinhauer
Wiki Aventurica
----------------------------------------------------------------------
>Comment By: Russell Blau (russblau)
Date: 2008-02-28 18:03
Message:
Logged In: YES
user_id=855050
Originator: NO
Addressed in r5095.
----------------------------------------------------------------------
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=603138&aid=1733835&group_…
Revision: 5095
Author: russblau
Date: 2008-02-28 23:03:10 +0000 (Thu, 28 Feb 2008)
Log Message:
-----------
Bug #1733835
Modified Paths:
--------------
trunk/pywikipedia/wikipedia.py
Modified: trunk/pywikipedia/wikipedia.py
===================================================================
--- trunk/pywikipedia/wikipedia.py 2008-02-28 22:58:33 UTC (rev 5094)
+++ trunk/pywikipedia/wikipedia.py 2008-02-28 23:03:10 UTC (rev 5095)
@@ -1364,10 +1364,9 @@
if retry_delay > 30:
retry_delay = 30
continue
-
-
+
# We are expecting a 302 to the action=view page. I'm not sure why this was removed in r5019
- if data != u"":
+ if data.strip() != u"":
# Something went wrong, and we don't know what. Show the
# HTML code that hopefully includes some error message.
output(u"ERROR: Unexpected response from wiki server.")
@@ -1375,11 +1374,11 @@
output(data)
# Unexpected responses should raise an error and not pass,
# be it silently or loudly. This should raise an error
-
+
if 'name="wpTextbox1"' in data and 'var wgAction = "submit"' in data:
# We are on the preview page, so the page was not saved
raise PageNotSaved
-
+
return response.status, response.reason, data
def canBeEdited(self):
@@ -2223,12 +2222,12 @@
if duration == 'none' or duration == None: duration = 'infinite'
if cascading == False: cascading = '0'
else: cascading = '1'
-
+
if edit != 'sysop' or move != 'sysop':
# You can't block a page as autoconfirmed and cascading, prevent the error
cascading = '0'
output(u"NOTE: The page can't be blocked with cascading and not also with only-sysop. Set cascading \"off\"")
-
+
predata = {
'mwProtect-cascade': cascading,
'mwProtect-level-edit': edit,
@@ -2860,7 +2859,7 @@
def __call__(self, requestsize=1):
"""
Block the calling program if the throttle time has not expired.
-
+
Parameter requestsize is the number of Pages to be read/written;
multiply delay time by an appropriate factor.
"""
@@ -3077,7 +3076,7 @@
def getLanguageLinks(text, insite = None, pageLink = "[[]]"):
"""
Return a dict of interlanguage links found in text.
-
+
Dict uses language codes as keys and Page objects as values.
Do not call this routine directly, use Page.interwiki() method
instead.
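The essential change in the first hunk of r5095 is that a whitespace-only response body, such as the four newlines reported in bug #1733835, no longer counts as an error. As a minimal sketch (the helper name is invented):

```python
def is_unexpected_response(data):
    """True when the server returned something other than whitespace."""
    return data.strip() != ""

# A blank (all-newline) body means the save went through normally;
# anything with real content is treated as an error page.
print(is_unexpected_response("\n\n\n\n"))            # False
print(is_unexpected_response("<html>error</html>"))  # True
```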
Revision: 5094
Author: russblau
Date: 2008-02-28 22:58:33 +0000 (Thu, 28 Feb 2008)
Log Message:
-----------
Improve screening for malformed redirect targets, and don't use "dict" as a local variable name.
Modified Paths:
--------------
trunk/pywikipedia/redirect.py
Modified: trunk/pywikipedia/redirect.py
===================================================================
--- trunk/pywikipedia/redirect.py 2008-02-28 18:56:51 UTC (rev 5093)
+++ trunk/pywikipedia/redirect.py 2008-02-28 22:58:33 UTC (rev 5094)
@@ -114,7 +114,7 @@
targets are the values.
'''
xmlFilename = self.xmlFilename
- dict = {}
+ redict = {}
# open xml dump and read page titles out of it
dump = xmlreader.XmlDump(xmlFilename)
site = wikipedia.getSite()
@@ -151,23 +151,24 @@
source = entry.title.replace(' ', '_')
target = target.replace(' ', '_')
# remove leading and trailing whitespace
- target = target.strip()
+ target = target.strip('_')
# capitalize the first letter
if not wikipedia.getSite().nocapitalize:
- source = source[0].upper() + source[1:]
- target = target[0].upper() + target[1:]
+ source = source[:1].upper() + source[1:]
+ target = target[:1].upper() + target[1:]
if '#' in target:
- target = target[:target.index('#')]
+ target = target[:target.index('#')].rstrip("_")
if '|' in target:
wikipedia.output(
u'HINT: %s is a redirect with a pipelink.'
% entry.title)
- target = target[:target.index('|')]
- dict[source] = target
+ target = target[:target.index('|')].rstrip("_")
+ if target: # in case preceding steps left nothing
+ redict[source] = target
if alsoGetPageTitles:
- return dict, pageTitles
+ return redict, pageTitles
else:
- return dict
+ return redict
def retrieve_broken_redirects(self):
if self.xmlFilename == None:
@@ -216,16 +217,16 @@
for redir_name in redir_names:
yield redir_name
else:
- dict = self.get_redirects_from_dump()
+ redict = self.get_redirects_from_dump()
num = 0
- for (key, value) in dict.iteritems():
+ for (key, value) in redict.iteritems():
num += 1
# check if the value - that is, the redirect target - is a
# redirect as well
- if num > self.offset and dict.has_key(value):
+ if num > self.offset and redict.has_key(value):
yield key
wikipedia.output(u'\nChecking redirect %i of %i...'
- % (num + 1, len(dict)))
+ % (num + 1, len(redict)))
class RedirectRobot:
def __init__(self, action, generator, always = False):
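The target-normalisation steps in the r5094 hunks above amount to roughly this pipeline (a simplified sketch that ignores sites with nocapitalize; the helper name is invented):

```python
def normalize_redirect_target(target):
    """Mimic r5094's cleanup of a raw redirect target.

    Returns None when the cleanup leaves nothing usable.
    """
    target = target.replace(' ', '_').strip('_')   # underscore form, trimmed
    target = target[:1].upper() + target[1:]       # capitalize first letter
    if '#' in target:                              # drop section anchors
        target = target[:target.index('#')].rstrip('_')
    if '|' in target:                              # drop pipelink text
        target = target[:target.index('|')].rstrip('_')
    return target or None

print(normalize_redirect_target(' foo bar#Section '))  # Foo_bar
print(normalize_redirect_target('#Section'))           # None
```

The final `or None` mirrors the added `if target:` guard, which keeps degenerate targets like a bare section link out of the redirect dict.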
Bugs item #1725373, was opened at 2007-05-25 04:22
Message generated for change (Settings changed) made by russblau
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=603138&aid=1725373&group_…
Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.
Category: None
Group: None
>Status: Closed
>Resolution: Fixed
Priority: 5
Private: No
Submitted By: Byrial Ole Jensen (byrial)
Assigned to: Nobody/Anonymous (nobody)
Summary: redirect.py double -xml fails to find all double redirects
Initial Comment:
redirect.py double -xml fails to find all double redirects. For example, dawiki-20070522-pages-meta-current.xml contains 99 double redirects; redirect.py could only find 6 of these and correct 5 (the 6th was a redirect directly to itself).
The full list of the 99 double redirects is at http://da.wikipedia.org/wiki/Wikipedia:Dobbelte_omdirigeringer
(Permanent link in case the page is edited: http://da.wikipedia.org/w/index.php?title=Wikipedia:Dobbelte_omdirigeringer…).
PS. It would also be nice to have an option to read the double redirects from a file.
----------------------------------------------------------------------
>Comment By: Russell Blau (russblau)
Date: 2008-02-28 17:57
Message:
Logged In: YES
user_id=855050
Originator: NO
Not sure when it was done, but the current version of redirect.py contains
code that should have fixed this bug.
----------------------------------------------------------------------
Comment By: Byrial Ole Jensen (byrial)
Date: 2007-05-25 13:42
Message:
Logged In: YES
user_id=23252
Originator: YES
I found that all the double redirects that were not found have targets
containing spaces, and therefore made this patch to fix the problem:
RCS file: /cvsroot/pywikipediabot/pywikipedia/redirect.py,v
retrieving revision 1.56
diff -u -r1.56 redirect.py
--- redirect.py 11 May 2007 11:42:27 -0000 1.56
+++ redirect.py 25 May 2007 17:37:26 -0000
@@ -110,9 +110,9 @@
break
# if the redirect does not link to another wiki
if target:
- target = target.replace(' ', '_')
# remove leading and trailing whitespace
target = target.strip()
+ target = target.replace('_', ' ')
# capitalize the first letter
if not wikipedia.getSite().nocapitalize:
target = target[0].upper() + target[1:]
It solves the problem when you get double redirects from an XML dump.
However I guess that the patch as is will break fixing double redirects
fetched from [[Special:DoubleRedirects]], but this is not tested.
----------------------------------------------------------------------
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=603138&aid=1725373&group_…
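To see why space-containing targets slipped through: redirect sources were stored in underscore form, so a target left with spaces could never match a key in the redirect dict. A minimal sketch of the failure mode:

```python
# Sketch only: sources are keyed with underscores, so a target kept
# in space form never matches, and the double redirect goes unseen.
redirects = {'A': 'Foo Bar', 'Foo_Bar': 'Baz'}

print('Foo Bar' in redirects)                    # False: space form misses
print('Foo Bar'.replace(' ', '_') in redirects)  # True once normalised
```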