pywikibot October 2008

pywikibot@lists.wikimedia.org

25 participants
195 discussions

[Pywikipedia-l] [ pywikipediabot-Bugs-2180414 ] reflinks.py
by SourceForge.net 19 Oct '08

19 Oct '08

Bugs item #2180414, was opened at 2008-10-19 20:38
Message generated for change (Settings changed) made by nicdumz
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=603138&aid=2180414&group_…

Please note that this message will contain a full copy of the comment thread, including the initial issue submission, for this request, not just the latest update.

>Category: None
Group: None
>Status: Closed
>Resolution: Fixed
Priority: 5
Private: No
Submitted By: Nobody/Anonymous (nobody)
Assigned to: Nobody/Anonymous (nobody)
Summary: reflinks.py

Initial Comment:
version:
Pywikipedia [http] trunk/pywikipedia (r5968, Oct 14 2008, 19:22:40)
Python 2.5.2 (r252:60911, Feb 21 2008, 13:11:45) [MSC v.1310 32 bit (Intel)]

Error: using reflinks.py on the Persian Wikipedia gives this error:

Traceback (most recent call last):
  File "C:\py\reflinks.py", line 733, in <module>
    main()
  File "C:\py\reflinks.py", line 728, in main
    bot = ReferencesRobot(generator, always, limit, ignorepdf)
  File "C:\py\reflinks.py", line 361, in __init__
    % self.stopPage.aslink())
AttributeError: 'unicode' object has no attribute 'aslink'

----------------------------------------------------------------------

Comment By: NicDumZ Nicolas Dumazet (nicdumz)
Date: 2008-10-20 04:42

Message:
Please update your working copy. This bug, like the others you reported on IRC, has since been fixed.

----------------------------------------------------------------------

You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=603138&aid=2180414&group_…
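The traceback suggests that `stopPage` held a plain string at a point where the code expected a page object. The thread does not show the fix that was actually committed, so the following is only a minimal sketch of one defensive pattern: normalize the argument before calling page methods on it. The `Page` class and the `ensure_page` helper are hypothetical stand-ins, not pywikipedia code, and the sketch is written for Python 3 rather than the Python 2.5 used in the report.

```python
class Page:
    """Hypothetical stand-in for a wiki page object (not the pywikipedia class)."""

    def __init__(self, title):
        self._title = title

    def aslink(self):
        # Render the title as a wiki link.
        return "[[%s]]" % self._title


def ensure_page(value):
    """Return a Page, wrapping a plain title string if necessary (hypothetical helper)."""
    if isinstance(value, Page):
        return value
    if isinstance(value, str):
        return Page(value)
    raise TypeError("expected a Page or a title string, got %r" % type(value))


stop_page = ensure_page("Main Page")   # a string is wrapped
same_page = ensure_page(stop_page)     # an existing Page passes through unchanged
print(stop_page.aslink())
```

With this kind of normalization at the constructor boundary, later calls such as `aslink()` no longer depend on how the caller passed the title.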


[Pywikipedia-l] [ pywikipediabot-Bugs-2180414 ] reflinks.py
by SourceForge.net 19 Oct '08

19 Oct '08

Bugs item #2180414, was opened at 2008-10-19 20:38
Message generated for change (Comment added) made by nicdumz
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=603138&aid=2180414&group_…

Please note that this message will contain a full copy of the comment thread, including the initial issue submission, for this request, not just the latest update.

Category: other
Group: None
Status: Open
Resolution: None
Priority: 5
Private: No
Submitted By: Nobody/Anonymous (nobody)
Assigned to: Nobody/Anonymous (nobody)
Summary: reflinks.py

Initial Comment:
version:
Pywikipedia [http] trunk/pywikipedia (r5968, Oct 14 2008, 19:22:40)
Python 2.5.2 (r252:60911, Feb 21 2008, 13:11:45) [MSC v.1310 32 bit (Intel)]

Error: using reflinks.py on the Persian Wikipedia gives this error:

Traceback (most recent call last):
  File "C:\py\reflinks.py", line 733, in <module>
    main()
  File "C:\py\reflinks.py", line 728, in main
    bot = ReferencesRobot(generator, always, limit, ignorepdf)
  File "C:\py\reflinks.py", line 361, in __init__
    % self.stopPage.aslink())
AttributeError: 'unicode' object has no attribute 'aslink'

----------------------------------------------------------------------

Comment By: NicDumZ Nicolas Dumazet (nicdumz)
Date: 2008-10-20 04:42

Message:
Please update your working copy. This bug, like the others you reported on IRC, has since been fixed.

----------------------------------------------------------------------

You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=603138&aid=2180414&group_…


[Pywikipedia-l] [ pywikipediabot-Bugs-2180544 ] Bug in wikipedia.py and corresponding fix
by SourceForge.net 19 Oct '08

19 Oct '08

Bugs item #2180544, was opened at 2008-10-19 19:36
Message generated for change (Tracker Item Submitted) made by Item Submitter
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=603138&aid=2180544&group_…

Please note that this message will contain a full copy of the comment thread, including the initial issue submission, for this request, not just the latest update.

Category: General
Group: None
Status: Open
Resolution: None
Priority: 5
Private: No
Submitted By: Bernhard Mayr (falk_steinhauer)
Assigned to: Nobody/Anonymous (nobody)
Summary: Bug in wikipedia.py and corresponding fix

Initial Comment:
Pywikipedia [http] trunk/pywikipedia (r6000, Oct 19 2008, 13:59:03)
Python 2.5.1 (r251:54863, Apr 18 2007, 08:51:08) [MSC v.1310 32 bit (Intel)]

The line preceding line 5229 of wikipedia.py needs to return from the function allpages() once the NotImplementedError has been caught. Otherwise every robot will crash when used on older wikis, since the function goes on to run code that is only appropriate for new wikis.

----------------------------------------------------------------------

You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=603138&aid=2180544&group_…
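The report does not quote the code around line 5229, so the following is only a sketch of the control flow it describes, under the assumption that `allpages()` is a generator that tries a newer code path first and falls back when `NotImplementedError` is raised. The helper names `fetch_via_api` and `fetch_via_html` are hypothetical and exist only for illustration.

```python
def fetch_via_api(start):
    """Hypothetical new-wiki code path; here it simulates an older wiki without support."""
    raise NotImplementedError("API listing not available on this wiki")


def fetch_via_html(start):
    """Hypothetical fallback path for older wikis."""
    for title in ["Alpha", "Beta"]:
        yield title


def allpages(start="!", api_fetch=fetch_via_api, html_fetch=fetch_via_html):
    try:
        titles = list(api_fetch(start))
    except NotImplementedError:
        # Older wiki: serve results from the fallback and then leave the
        # generator, so the new-wiki-only code below is never reached.
        for title in html_fetch(start):
            yield title
        return
    # Code from here on is only appropriate when the API call succeeded.
    for title in titles:
        yield title


print(list(allpages()))
```

Without the `return` in the `except` branch, execution would continue into the API-only section with `titles` undefined, which matches the crash on older wikis described in the report.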


[Pywikipedia-l] [ pywikipediabot-Bugs-2180414 ] reflinks.py
by SourceForge.net 19 Oct '08

19 Oct '08

Bugs item #2180414, was opened at 2008-10-19 18:38
Message generated for change (Tracker Item Submitted) made by Item Submitter
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=603138&aid=2180414&group_…

Please note that this message will contain a full copy of the comment thread, including the initial issue submission, for this request, not just the latest update.

Category: other
Group: None
Status: Open
Resolution: None
Priority: 5
Private: No
Submitted By: Nobody/Anonymous (nobody)
Assigned to: Nobody/Anonymous (nobody)
Summary: reflinks.py

Initial Comment:
version:
Pywikipedia [http] trunk/pywikipedia (r5968, Oct 14 2008, 19:22:40)
Python 2.5.2 (r252:60911, Feb 21 2008, 13:11:45) [MSC v.1310 32 bit (Intel)]

Error: using reflinks.py on the Persian Wikipedia gives this error:

Traceback (most recent call last):
  File "C:\py\reflinks.py", line 733, in <module>
    main()
  File "C:\py\reflinks.py", line 728, in main
    bot = ReferencesRobot(generator, always, limit, ignorepdf)
  File "C:\py\reflinks.py", line 361, in __init__
    % self.stopPage.aslink())
AttributeError: 'unicode' object has no attribute 'aslink'

----------------------------------------------------------------------

You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=603138&aid=2180414&group_…


[Pywikipedia-l] SVN: [6000] trunk/pywikipedia/wikipedia.py
by filnik＠svn.wikimedia.org 19 Oct '08

19 Oct '08

Revision: 6000
Author: filnik
Date: 2008-10-19 13:59:03 +0000 (Sun, 19 Oct 2008)

Log Message:
-----------
Rewrite of the newimages() function, to get the data from the APIs

Modified Paths:
--------------
    trunk/pywikipedia/wikipedia.py

Modified: trunk/pywikipedia/wikipedia.py
===================================================================
--- trunk/pywikipedia/wikipedia.py  2008-10-19 13:58:24 UTC (rev 5999)
+++ trunk/pywikipedia/wikipedia.py  2008-10-19 13:59:03 UTC (rev 6000)
@@ -5048,29 +5048,51 @@
             if not repeat:
                 break

-    def newimages(self, number = 10, repeat = False):
-        """Yield ImagePages from Special:Log&type=upload"""
+    def newimages(self, number = 100, lestart = None, leend = None, leuser = None, letitle = None, repeat = False):
+        """
+        Yield ImagePages from APIs, call: action=query&list=logevents&letype=upload&lelimit=500

-        seen = set()
-        regexp = re.compile(r'(?:<li[^>]*>|<div class="mw-log-entry">)(?P<date>.+?)\s+<a href=.*?>(?P<user>.+?)</a>\s+\(.+?</a>\).*?<a href=".*?"(?P<new> class="new")? title=".*?"\s*>(?P<image>.+?)</a>(?:.*?<span class="comment">\((?P<comment>.*?)\)</span>)?', re.UNICODE)
+        Options directly from APIs:
+        ---
+        Parameters:
+                         Default: ids|title|type|user|timestamp|comment|details
+          lestart        - The timestamp to start enumerating from.
+          leend          - The timestamp to end enumerating.
+          ledir          - In which direction to enumerate.
+                           One value: newer, older
+                           Default: older
+          leuser         - Filter entries to those made by the given user.
+          letitle        - Filter entries to those related to a page.
+          lelimit        - How many total event entries to return.
+                           No more than 500 (5000 for bots) allowed.
+                           Default: 10
+        """
+        params = {
+            'action'    :'query',
+            'list'      :'logevents',
+            'letype'    :'upload',
+            'lelimit'   :int(number),
+            }
+        if lestart != None: params['lestart'] = lestart
+        if leend != None: params['leend'] = leend
+        if leend != None: params['leuser'] = leuser
+        if leend != None: params['letitle'] = letitle
+
+        data = query.GetData(params,
+                        useAPI = True, encodeTitle = False)
+        imagesData = data['query']['logevents']
         while True:
-            path = self.log_address(number, mode = 'upload')
-            get_throttle()
-            html = self.getUrl(path)
-            for m in regexp.finditer(html):
-                image = m.group('image')
-
-                if image not in seen:
-                    seen.add(image)
-
-                    if m.group('new'):
-                        output(u"Image \'%s\' has been deleted." % image)
-                        continue
-
-                    date = m.group('date')
-                    user = m.group('user')
-                    comment = m.group('comment') or ''
-                    yield ImagePage(self, image), date, user, comment
+            for imageData in imagesData:
+                try:
+                    comment = imageData['comment']
+                except KeyError:
+                    comment = ''
+                pageid = imageData['pageid']
+                title = imageData['title']
+                timestamp = imageData['timestamp']
+                logid = imageData['logid']
+                user = imageData['user']
+                yield ImagePage(self, title), timestamp, user, comment
             if not repeat:
                 break
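In the r6000 diff, `leuser` and `letitle` are both added to `params` only when `leend` is set, which looks like a copy-paste slip: a caller passing `leuser` without `leend` would apparently have the filter silently dropped. A minimal sketch of guarding each optional parameter on its own value follows. The function name `build_logevents_params` is hypothetical, the sketch only builds the parameter dictionary (it does not call `query.GetData`), and it is written for Python 3.

```python
def build_logevents_params(number=100, lestart=None, leend=None,
                           leuser=None, letitle=None):
    """Build the query parameters for list=logevents&letype=upload (hypothetical helper)."""
    params = {
        'action': 'query',
        'list': 'logevents',
        'letype': 'upload',
        'lelimit': int(number),
    }
    optional = {
        'lestart': lestart,
        'leend': leend,
        'leuser': leuser,
        'letitle': letitle,
    }
    # Each optional filter is checked against its own value, not against leend.
    for key, value in optional.items():
        if value is not None:
            params[key] = value
    return params


print(build_logevents_params(number=50, leuser="ExampleUser"))
```

Driving the guards from one dictionary keeps the check and the assignment on the same key, so the two cannot drift apart.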


[Pywikipedia-l] SVN: [5999] trunk/pywikipedia/checkimages.py
by filnik＠svn.wikimedia.org 19 Oct '08

19 Oct '08

Revision: 5999
Author: filnik
Date: 2008-10-19 13:58:24 +0000 (Sun, 19 Oct 2008)

Log Message:
-----------
Some bugfixes, adapting to the new newimages() function

Modified Paths:
--------------
    trunk/pywikipedia/checkimages.py

Modified: trunk/pywikipedia/checkimages.py
===================================================================
--- trunk/pywikipedia/checkimages.py  2008-10-19 09:55:03 UTC (rev 5998)
+++ trunk/pywikipedia/checkimages.py  2008-10-19 13:58:24 UTC (rev 5999)
@@ -535,11 +535,13 @@
         self.smartdetection = smartdetection
         if self.smartdetection:
             self.list_licenses = self.load_licenses()
-    def setParameters(self, imageName):
+    def setParameters(self, imageName, timestamp, uploader):
         """ Function to set parameters, now only image but maybe it can be used for others in "future" """
         self.imageName = imageName
         # Defing the image's Page Object
         self.image = wikipedia.ImagePage(self.site, u'%s%s' % (self.image_namespace, self.imageName))
+        self.timestamp = timestamp
+        self.uploader = uploader

     def report(self, newtext, image_to_report, notification = None, head = None, notification2 = None, unver = True, commTalk = None, commImage = None):
         """ Function to make the reports easier. """
@@ -614,10 +616,12 @@
             # has upload the image (FixME: Rewrite a bit this part)
             if put:
                 reportPageObject.put(reportPageText + self.newtext, comment = self.commImage, minorEdit = True)
-            # paginetta it's the image page object.
-
+        # paginetta it's the image page object.
         try:
-            nick = reportPageObject.getLatestUploader()[0]
+            if reportPageObject == self.image and self.uploader != None:
+                nick = self.uploader
+            else:
+                nick = reportPageObject.getLatestUploader()[0]
         except wikipedia.NoPage:
             wikipedia.output(u"Seems that %s hasn't the image at all, but there is something in the description..." % self.image_to_report)
             repme = u"\n*[[:Image:%s]] problems '''with the APIs'''"
@@ -860,7 +864,10 @@
             time_list = list()
             for duplicate in duplicates:
                 DupePage = wikipedia.ImagePage(self.site, u'Image:%s' % duplicate)
-                imagedata = DupePage.getLatestUploader()[1]
+                if DupePage == self.image and self.timestamp != None:
+                    imagedata = self.timestamp
+                else:
+                    imagedata = DupePage.getLatestUploader()[1]
                 # '2008-06-18T08:04:29Z'
                 data = time.strptime(imagedata, u"%Y-%m-%dT%H:%M:%SZ")
                 data_seconds = time.mktime(data)
@@ -1176,7 +1183,7 @@
             first x seconds.
         """
         #http://pytz.sourceforge.net/ <- maybe useful?
-        imagedata = self.image.getLatestUploader()[1]
+        imagedata = self.timestamp
         # '2008-06-18T08:04:29Z'
         img_time = datetime.datetime.strptime(imagedata, u"%Y-%m-%dT%H:%M:%SZ") #not relative to localtime
         now = datetime.datetime.strptime(str(datetime.datetime.utcnow()).split('.')[0], "%Y-%m-%d %H:%M:%S") #timezones are UTC
@@ -1283,7 +1290,7 @@
         something = ['{{'] # Don't put "}}" here, please. Useless and can give problems.
         # Unused file extensions. Does not contain PDF.
         notallowed = ("xcf", "xls", "sxw", "sxi", "sxc", "sxd")
-        parentesi = False # parentesi are these in italian: { ( ) } []
+        brackets = False
         delete = False
         extension = self.imageName.split('.')[-1] # get the extension from the image's name
         # Load the notification messages
@@ -1309,10 +1316,10 @@
             self.imageCheckText = self.image.get()
             self.imageFullText = self.imageCheckText
         except wikipedia.NoPage:
-            wikipedia.output(u"Skipping %s because it has been deleted." % imageName)
+            wikipedia.output(u"Skipping %s because it has been deleted." % self.imageName)
             return True
         except wikipedia.IsRedirectPage:
-            wikipedia.output(u"The file description for %s is a redirect?!" % imageName)
+            wikipedia.output(u"The file description for %s is a redirect?!" % self.imageName)
             return True
         # Delete the fields where the templates cannot be loaded
         regex_nowiki = re.compile(r'<nowiki>(.*?)</nowiki>', re.DOTALL)
@@ -1329,7 +1336,7 @@
         for a_word in something: # something is the array with {{, MIT License and so on.
             if a_word in self.imageCheckText:
                 # There's a template, probably a license (or I hope so)
-                parentesi = True
+                brackets = True
         # Is the extension allowed? (is it an image or f.e. a .xls file?)
         for parl in notallowed:
             if parl.lower() in extension.lower():
@@ -1366,7 +1373,7 @@
             wikipedia.output(u"Skipping the image...")
             self.some_problem = False
             return True
-        elif parentesi == True:
+        elif brackets == True:
             seems_ok = False
             license_found = None
             if smartdetection:
@@ -1374,7 +1381,7 @@
             else:
                 printWithTimeZone(u"%s seems ok..." % self.imageName)
             # It works also without this... but i want only to be sure ^^
-            parentesi = False
+            brackets = False
             return True
         elif delete == True:
             wikipedia.output(u"%s is not a file!" % self.imageName)
@@ -1556,7 +1563,7 @@
             normal = False # Ensure that normal is False
         # Normal True? Take the default generator
         if normal == True:
-            generator = pagegenerators.NewimagesPageGenerator(number = limit, site = site)
+            generator = site.newimages(number = limit)
         # if urlUsed and regexGen, get the source for the generator
         if urlUsed == True and regexGen == True:
             textRegex = site.getUrl(regexPageUrl, no_hostname = True)
@@ -1593,12 +1600,22 @@
                 'image:' not in image.title().lower():
                 wikipedia.output(u'%s seems not an image, skip it...' % image.title())
                 continue
+            if normal:
+                imageData = image
+                image = imageData[0]
+                timestamp = imageData[1]
+                uploader = imageData[2]
+                comment = imageData[3] # useless, in reality..
+            else:
+                timestamp = None
+                uploader = None
+                comment = None # useless, also this, let it here for further developments
             try:
                 imageName = image.title().split(image_namespace)[1] # Deleting the namespace (useless here)
             except IndexError:# Namespace image not found, that's not an image! Let's skip...
                 wikipedia.output(u"%s is not an image, skipping..." % image.title())
                 continue
-            mainClass.setParameters(imageName) # Setting the image for the main class
+            mainClass.setParameters(imageName, timestamp, uploader) # Setting the image for the main class
             # If I don't inizialize the generator, wait part and skip part are useless
             if wait:
                 # Let's sleep...
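The r5999 diff parses timestamps of the form `'2008-06-18T08:04:29Z'` and, in the duplicates check, converts them with `time.mktime()`, which interprets the parsed value as local time even though the trailing `Z` marks it as UTC. When two timestamps converted the same way are compared, the offset largely cancels, but comparing against a UTC clock would not. A minimal sketch of a UTC-consistent conversion using `calendar.timegm()` is shown below. The helper names are hypothetical, and this is not the code in checkimages.py.

```python
import calendar
import time

MW_TIMESTAMP_FORMAT = "%Y-%m-%dT%H:%M:%SZ"


def mw_timestamp_to_epoch(timestamp):
    """Convert a UTC timestamp such as '2008-06-18T08:04:29Z' to epoch seconds."""
    parsed = time.strptime(timestamp, MW_TIMESTAMP_FORMAT)
    # timegm treats the struct_time as UTC; mktime would treat it as local time.
    return calendar.timegm(parsed)


def seconds_since_upload(timestamp, now=None):
    """Age of an upload in seconds; `now` is epoch seconds (defaults to the current time)."""
    if now is None:
        now = time.time()
    return now - mw_timestamp_to_epoch(timestamp)


print(mw_timestamp_to_epoch("1970-01-02T00:00:00Z"))
```

Passing `now` explicitly makes the age calculation easy to check without depending on the wall clock.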


[Pywikipedia-l] SVN: [5998] trunk/pywikipedia/checkimages.py
by filnik＠svn.wikimedia.org 19 Oct '08

19 Oct '08

Revision: 5998
Author: filnik
Date: 2008-10-19 09:55:03 +0000 (Sun, 19 Oct 2008)

Log Message:
-----------
BUGFIX O_O continue instead of a break

Modified Paths:
--------------
    trunk/pywikipedia/checkimages.py

Modified: trunk/pywikipedia/checkimages.py
===================================================================
--- trunk/pywikipedia/checkimages.py  2008-10-19 09:52:04 UTC (rev 5997)
+++ trunk/pywikipedia/checkimages.py  2008-10-19 09:55:03 UTC (rev 5998)
@@ -1629,7 +1629,7 @@
             time.sleep(time_sleep)
         elif repeat == False:
             wikipedia.output(u"\t\t\t>> STOP! <<")
-            continue # Exit
+            break # Exit

 # Here there is the main loop. I'll take all the (name of the) images and then i'll check them.
 if __name__ == "__main__":


[Pywikipedia-l] SVN: [5997] trunk/pywikipedia/checkimages.py
by filnik＠svn.wikimedia.org 19 Oct '08

19 Oct '08

Revision: 5997
Author: filnik
Date: 2008-10-19 09:52:04 +0000 (Sun, 19 Oct 2008)

Log Message:
-----------
...anotherone... -_-

Modified Paths:
--------------
    trunk/pywikipedia/checkimages.py

Modified: trunk/pywikipedia/checkimages.py
===================================================================
--- trunk/pywikipedia/checkimages.py  2008-10-19 09:51:03 UTC (rev 5996)
+++ trunk/pywikipedia/checkimages.py  2008-10-19 09:52:04 UTC (rev 5997)
@@ -1265,7 +1265,6 @@
                             break
                     elif find_tipe.lower() == 'find':
                         if re.findall(r'%s' % k.lower(), self.imageCheckText.lower()) != []:
-                            print re.findall(r'%s' % k.lower(), self.imageCheckText.lower())
                             self.some_problem = True
                             self.text_used = text
                             self.head_used = head_2


[Pywikipedia-l] SVN: [5996] trunk/pywikipedia/checkimages.py
by filnik＠svn.wikimedia.org 19 Oct '08

19 Oct '08

Revision: 5996
Author: filnik
Date: 2008-10-19 09:51:03 +0000 (Sun, 19 Oct 2008)

Log Message:
-----------
forgot a debug print -.-'

Modified Paths:
--------------
    trunk/pywikipedia/checkimages.py

Modified: trunk/pywikipedia/checkimages.py
===================================================================
--- trunk/pywikipedia/checkimages.py  2008-10-19 09:42:51 UTC (rev 5995)
+++ trunk/pywikipedia/checkimages.py  2008-10-19 09:51:03 UTC (rev 5996)
@@ -1255,7 +1255,6 @@
                         searchResults = re.findall(r'%s' % k.lower(), self.imageCheckText.lower())
                         if searchResults != []:
                             if searchResults[0] == self.imageCheckText.lower():
-                                print searchResults[0]
                                 self.some_problem = True
                                 self.text_used = text
                                 self.head_used = head_2
@@ -1264,7 +1263,7 @@
                                 self.summary_used = summary
                                 self.mex_used = mexCatched
                                 break
-                    elif find_tipe.lower() == 'find':
+                    elif find_tipe.lower() == 'find':
                         if re.findall(r'%s' % k.lower(), self.imageCheckText.lower()) != []:
                             print re.findall(r'%s' % k.lower(), self.imageCheckText.lower())
                             self.some_problem = True


[Pywikipedia-l] SVN: [5995] trunk/pywikipedia/checkimages.py
by filnik＠svn.wikimedia.org 19 Oct '08

19 Oct '08

Revision: 5995
Author: filnik
Date: 2008-10-19 09:42:51 +0000 (Sun, 19 Oct 2008)

Log Message:
-----------
Now the settings find uses regex, not simple text

Modified Paths:
--------------
    trunk/pywikipedia/checkimages.py

Modified: trunk/pywikipedia/checkimages.py
===================================================================
--- trunk/pywikipedia/checkimages.py  2008-10-19 09:19:40 UTC (rev 5994)
+++ trunk/pywikipedia/checkimages.py  2008-10-19 09:42:51 UTC (rev 5995)
@@ -1252,7 +1252,21 @@
                 mexCatched = tupla[8]
                 for k in find_list:
                     if find_tipe.lower() == 'findonly':
-                        if k.lower() == self.imageCheckText.lower():
+                        searchResults = re.findall(r'%s' % k.lower(), self.imageCheckText.lower())
+                        if searchResults != []:
+                            if searchResults[0] == self.imageCheckText.lower():
+                                print searchResults[0]
+                                self.some_problem = True
+                                self.text_used = text
+                                self.head_used = head_2
+                                self.imagestatus_used = imagestatus
+                                self.name_used = name
+                                self.summary_used = summary
+                                self.mex_used = mexCatched
+                                break
+                    elif find_tipe.lower() == 'find':
+                        if re.findall(r'%s' % k.lower(), self.imageCheckText.lower()) != []:
+                            print re.findall(r'%s' % k.lower(), self.imageCheckText.lower())
                             self.some_problem = True
                             self.text_used = text
                             self.head_used = head_2
@@ -1260,16 +1274,6 @@
                             self.name_used = name
                             self.summary_used = summary
                             self.mex_used = mexCatched
-                            break
-                    elif find_tipe.lower() == 'find':
-                        if k.lower() in self.imageCheckText.lower():
-                            self.some_problem = True
-                            self.text_used = text
-                            self.head_used = head_2
-                            self.imagestatus_used = imagestatus
-                            self.name_used = name
-                            self.summary_used = summary
-                            self.mex_used = mexCatched
                             continue

     def checkStep(self, smartdetection):
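The r5995 diff switches the `find` and `findonly` settings from plain string comparison to regular expressions: `find` matches when the pattern occurs anywhere in the page text, and `findonly` when the first match equals the whole text. The following is a minimal sketch of those two checks, not the code in checkimages.py. It assumes Python 3 (`re.fullmatch` does not exist in the Python 2.5 used at the time) and uses `re.IGNORECASE` instead of lowercasing the pattern, because lowercasing a regex also changes escapes such as `\S` into `\s`. The function names are hypothetical.

```python
import re


def matches_find(pattern, text):
    """True if the pattern occurs anywhere in the text, ignoring case."""
    return re.search(pattern, text, re.IGNORECASE) is not None


def matches_findonly(pattern, text):
    """True only if the pattern matches the entire text, ignoring case."""
    return re.fullmatch(pattern, text, re.IGNORECASE) is not None


page_text = "Some description {{Delete}}"
print(matches_find(r"\{\{delete", page_text))      # pattern occurs inside the text
print(matches_findonly(r"\{\{delete", page_text))  # text contains more than the pattern
```

Returning a boolean from each check keeps the matching logic separate from the bookkeeping (`some_problem`, `text_used`, and so on) that the diff repeats in both branches.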

