Patches item #1898557, was opened at 2008-02-21 10:12
Message generated for change (Comment added) made by nicdumz
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=603140&aid=1898557&group_…
Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.
Category: None
Group: None
Status: Open
Resolution: None
Priority: 5
Private: No
Submitted By: NicDumZ — Nicolas Dumazet (nicdumz)
Assigned to: Nobody/Anonymous (nobody)
Summary: noreferences.py: better indent for the references section
Initial Comment:
When adding a <references/> tag, match the indentation (heading level) of the next section instead of always using a plain == %s == heading.
Before :
== See Also ==
+ == References ==
+
+ <references/>
=== External links ===
Now :
== See Also ==
+ === References ===
+
+ <references/>
=== External links ===
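The idea can be sketched like this (a hypothetical helper, not the actual noreferences.py code): detect the heading level of the neighboring section and reuse it for the new References heading.

```python
import re

def references_heading(text):
    """Return a References heading whose level matches the first
    section heading found near the insertion point (sketch only).
    Falls back to the old == ... == level when no heading is found."""
    match = re.search(r"^(=+) *[^=]+? *\1 *$", text, re.MULTILINE)
    level = len(match.group(1)) if match else 2
    marker = "=" * level
    return "%s References %s" % (marker, marker)
```

So next to `=== External links ===` the helper produces `=== References ===`, matching the "Now" output above.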
Cheers :)
Nicolas Dumazet.
----------------------------------------------------------------------
>Comment By: NicDumZ — Nicolas Dumazet (nicdumz)
Date: 2008-04-12 12:43
Message:
Logged In: YES
user_id=1963242
Originator: YES
applying it in r5207
----------------------------------------------------------------------
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=603140&aid=1898557&group_…
Revision: 5205
Author: nicdumz
Date: 2008-04-12 10:15:11 +0000 (Sat, 12 Apr 2008)
Log Message:
-----------
> > > Yeehee !! I can commit :) < < <
Repairing the weblink (Special:Linksearch) page generator, which has been broken for ages:
BEFORE :
~/projets/pywikipedia\ > python pagegenerators.py -weblink:myspace.com -lang:fr | wc -l
Checked for running processes. 1 processes currently running, including the current process.
Querying [[Special:Linksearch]]...
453
AFTER :
~/projets/devpywiki\ > python pagegenerators.py -weblink:myspace.com -lang:fr | wc -l
Checked for running processes. 1 processes currently running, including the current process.
Querying [[Special:Linksearch]]...
2199
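The fix below pages through the Special:Linksearch results instead of consuming only the first batch, which explains the 453 vs. 2199 difference above. The general offset-pagination pattern, sketched here with a hypothetical `fetch` callable rather than the pywikipedia API:

```python
def paginate(fetch, limit=500):
    """Yield every item by calling fetch(limit, offset) repeatedly
    until a batch comes back empty (offset-pagination sketch)."""
    offset = 0
    while True:
        batch = fetch(limit, offset)
        if not batch:        # server has no more results
            break
        for item in batch:
            yield item
        offset += limit      # advance to the next page of results
```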
Modified Paths:
--------------
trunk/pywikipedia/pagegenerators.py
trunk/pywikipedia/wikipedia.py
Modified: trunk/pywikipedia/pagegenerators.py
===================================================================
--- trunk/pywikipedia/pagegenerators.py 2008-04-11 20:29:11 UTC (rev 5204)
+++ trunk/pywikipedia/pagegenerators.py 2008-04-12 10:15:11 UTC (rev 5205)
@@ -411,7 +411,7 @@
"""
if site is None:
site = wikipedia.getSite()
- for page in site.linksearch(link):
+ for page in site.linksearch(link, limit=step):
yield page
def SearchPageGenerator(query, number = 100, namespaces = None, site = None):
@@ -872,6 +872,9 @@
transclusionPage = wikipedia.Page(wikipedia.getSite(), 'Template:%s' % transclusionPageTitle)
gen = ReferringPageGenerator(transclusionPage, onlyTemplateInclusion = True)
elif arg.startswith('-start'):
+ if arg.startswith('-startxml'):
+ wikipedia.output(u'-startxml : wrong parameter')
+ sys.exit()
firstPageTitle = arg[7:]
if not firstPageTitle:
firstPageTitle = wikipedia.input(u'At which page do you want to start?')
Modified: trunk/pywikipedia/wikipedia.py
===================================================================
--- trunk/pywikipedia/wikipedia.py 2008-04-11 20:29:11 UTC (rev 5204)
+++ trunk/pywikipedia/wikipedia.py 2008-04-12 10:15:11 UTC (rev 5205)
@@ -4876,32 +4876,45 @@
else:
break
- def linksearch(self, siteurl):
+ def linksearch(self, siteurl, limit=500):
"""Yield Pages from results of Special:Linksearch for 'siteurl'."""
if siteurl.startswith('*.'):
siteurl = siteurl[2:]
output(u'Querying [[Special:Linksearch]]...')
cache = []
+ R = re.compile('title ?=\"(.*?)\"')
for url in [siteurl, '*.' + siteurl]:
- path = self.linksearch_address(url)
- get_throttle()
- html = self.getUrl(path)
- loc = html.find('<div class="mw-spcontent">')
- if loc > -1:
- html = html[loc:]
- loc = html.find('<div class="printfooter">')
- if loc > -1:
- html = html[:loc]
- R = re.compile('title ?=\"(.*?)\"')
- for title in R.findall(html):
- if not siteurl in title:
- # the links themselves have similar form
- if title in cache:
- continue
- else:
- cache.append(title)
- yield Page(self, title)
+ offset = 0
+ while True:
+ path = self.linksearch_address(url, limit=limit, offset=offset)
+ get_throttle()
+ html = self.getUrl(path)
+ #restricting the HTML source :
+ #when in the source, this div marks the beginning of the input
+ loc = html.find('<div class="mw-spcontent">')
+ if loc > -1:
+ html = html[loc:]
+ #when in the source, marks the end of the linklist
+ loc = html.find('<div class="printfooter">')
+ if loc > -1:
+ html = html[:loc]
+ #our regex fetches internal page links and the link they contain
+ links = R.findall(html)
+ if not links:
+ #no more page to be fetched for that link
+ break
+ for title in links:
+ if not siteurl in title:
+ # the links themselves have similar form
+ if title in cache:
+ continue
+ else:
+ cache.append(title)
+ yield Page(self, title)
+ offset += limit
+
+
def __repr__(self):
return self.family.name+":"+self.lang
Revision: 5204
Author: russblau
Date: 2008-04-11 20:29:11 +0000 (Fri, 11 Apr 2008)
Log Message:
-----------
implemented getrevisions() [incomplete], and made modest changes elsewhere
Modified Paths:
--------------
branches/rewrite/pywikibot/data/api.py
branches/rewrite/pywikibot/date.py
branches/rewrite/pywikibot/page.py
branches/rewrite/pywikibot/site.py
branches/rewrite/pywikibot/throttle.py
Modified: branches/rewrite/pywikibot/data/api.py
===================================================================
--- branches/rewrite/pywikibot/data/api.py 2008-04-11 13:36:23 UTC (rev 5203)
+++ branches/rewrite/pywikibot/data/api.py 2008-04-11 20:29:11 UTC (rev 5204)
@@ -145,19 +145,23 @@
while True:
# TODO catch http errors
try:
- if self.params.get("action", "") in ("login",):
- rawdata = http.request(self.site, uri, method="POST",
- headers={'Content-Type':
- 'application/x-www-form-urlencoded'},
- body=params)
- else:
- uri = uri + "?" + params
- rawdata = http.request(self.site, uri)
- except Exception, e: #TODO: what exceptions can occur here?
- logging.warning(traceback.format_exc())
- print uri, params
- self.wait()
- continue
+ self.site.sitelock.acquire()
+ try:
+ if self.params.get("action", "") in ("login",):
+ rawdata = http.request(self.site, uri, method="POST",
+ headers={'Content-Type':
+ 'application/x-www-form-urlencoded'},
+ body=params)
+ else:
+ uri = uri + "?" + params
+ rawdata = http.request(self.site, uri)
+ except Exception, e: #TODO: what exceptions can occur here?
+ logging.warning(traceback.format_exc())
+ print uri, params
+ self.wait()
+ continue
+ finally:
+ self.site.sitelock.release()
if rawdata.startswith(u"unknown_action"):
raise APIError(rawdata[:14], rawdata[16:])
try:
@@ -197,7 +201,7 @@
if lag:
logging.info(
"Pausing due to database lag: " + info)
- self.wait(int(lag.group("lag")))
+ self.lag_wait(int(lag.group("lag")))
continue
if code in (u'internal_api_error_DBConnectionError', ):
self.wait()
@@ -208,25 +212,32 @@
except TypeError:
raise RuntimeError(result)
- def wait(self, lag=None):
+ def wait(self):
"""Determine how long to wait after a failed request."""
self.max_retries -= 1
if self.max_retries < 0:
raise TimeoutError("Maximum retries attempted without success.")
- wait = self.retry_wait
- if lag is not None:
- # in case of database lag, wait half the lag time,
- # but not less than 5 or more than 120 seconds
- wait = max(5, min(lag // 2, 120))
logging.warn("Waiting %s seconds before retrying." % wait)
- time.sleep(wait)
- if lag is None:
- self.retry_wait = min(120, self.retry_wait * 2)
+ time.sleep(self.retry_wait)
+ # double the next wait, but do not exceed 120 seconds
+ self.retry_wait = min(120, self.retry_wait * 2)
+ def lag_wait(self, lag):
+ """Wait due to server lag."""
+ # unlike regular wait, this shuts down all access to site
+ self.site.sitelock.acquire()
+ try:
+ # wait at least 5 seconds, no more than 120
+ wait = max(5, min(120, lag//2))
+ logging.warn("Pausing %s seconds due to server lag." % wait)
+ time.sleep(wait)
+ finally:
+ self.site.sitelock.release()
+
class PageGenerator(object):
"""Iterator for response to a request of type action=query&generator=foo."""
- def __init__(self, generator="", **kwargs):
+ def __init__(self, generator, **kwargs):
"""
Required and optional parameters are as for C{Request}, except that
action=query is assumed and generator is required.
@@ -235,8 +246,6 @@
@type generator: str
"""
- if not generator:
- raise ValueError("generator argument is required.")
if generator not in self.limits:
raise ValueError("Unrecognized generator '%s'" % generator)
self.request = Request(action="query", generator=generator, **kwargs)
@@ -261,7 +270,6 @@
self.resultkey = "pages" # element to look for in result
# dict mapping generator types to their limit parameter names
-
limits = {'links': None,
'images': None,
'templates': None,
@@ -348,6 +356,75 @@
return image
+class PropertyGenerator(object):
+ """Generator for queries of type action=query&property=..."""
+
+ def __init__(self, prop, **kwargs):
+ """
+ Required and optional parameters are as for C{Request}, except that
+ action=query is assumed and prop is required.
+
+ @param prop: the "property=" type from api.php
+ @type prop: str
+
+ """
+ self.request = Request(action="query", prop=prop, **kwargs)
+ if prop not in self.limits:
+ raise ValueError("Unrecognized property '%s'" % prop)
+ # set limit to max, if applicable
+ if self.limits[prop] and kwargs.pop("getAll", False):
+ self.request['g'+self.limits[generator]] = "max"
+ self.site = self.request.site
+ self.resultkey = prop # element to look for in result
+
+ # dict mapping property types to their limit parameter names
+ limits = {'revisions': 'rvlimit',
+ 'imageinfo': 'iilimit',
+ 'info': None,
+ 'links': None,
+ 'langlinks': None,
+ 'images': None,
+ 'imageinfo': None,
+ 'templates': None,
+ 'categories': None,
+ 'extlinks': None,
+ }
+
+ def __iter__(self):
+ """Iterate objects for elements found in response."""
+ # this looks for the resultkey ''inside'' a <page> entry
+ while True:
+ self.site.get_throttle()
+ self.data = self.request.submit()
+ if not self.data or not isinstance(self.data, dict):
+ raise StopIteration
+ if not ("query" in self.data and "pages" in self.data["query"]):
+ raise StopIteration
+ pagedata = self.data["query"]["pages"].values()
+ assert len(pagedata)==1
+ pagedata = pagedata[0]
+ if not self.resultkey in pagedata:
+ raise StopIteration
+ if isinstance(pagedata[self.resultkey], dict):
+ for v in pagedata[self.resultkey].itervalues():
+ yield v
+ elif isinstance(pagedata[self.resultkey], list):
+ for v in pagedata[self.resultkey]:
+ yield v
+ else:
+ raise APIError("Unknown",
+ "Unknown format in ['%s'] value."
+ % self.resultkey,
+ data=pagedata[self.resultkey])
+ if not "query-continue" in self.data:
+ return
+ if not self.resultkey in self.data["query-continue"]:
+ raise APIError("Unknown",
+ "Missing '%s' key in ['query-continue'] value.",
+ data=self.data["query-continue"])
+ self.request.update(self.data["query-continue"][self.resultkey])
+
+
class LoginManager(login.LoginManager):
"""Supplies getCookie() method to use API interface."""
def getCookie(self, remember=True, captchaId=None, captchaAnswer=None):
Modified: branches/rewrite/pywikibot/date.py
===================================================================
--- branches/rewrite/pywikibot/date.py 2008-04-11 13:36:23 UTC (rev 5203)
+++ branches/rewrite/pywikibot/date.py 2008-04-11 20:29:11 UTC (rev 5204)
@@ -1,4 +1,4 @@
-# -*- coding: utf-8 -*-
+# -*- coding: utf-8 -*-
"""
This file is not runnable, but it only consists of various
lists which are required by some other programs.
@@ -17,9 +17,7 @@
# used for date recognition
import types
import re
-import wikipedia
-
#
# Different collections of well known formats
#
@@ -1523,7 +1521,7 @@
"""
"""
for s in makeMonthNamedList( lang, pattern, capitalize ):
- wikipedia.output( s )
+ print( s )
def testMapEntry( formatName, showAll = True, value = None ):
@@ -1542,7 +1540,7 @@
if value is not None:
start, stop = value, value+1
if showAll:
- wikipedia.output(u"Processing %s with limits from %d to %d and step %d" % (formatName, start,stop-1,step))
+ print(u"Processing %s with limits from %d to %d and step %d" % (formatName, start,stop-1,step))
for code, convFunc in formats[formatName].iteritems():
# import time
@@ -1555,18 +1553,21 @@
if newValue != value:
raise AssertionError(" %s != %s: assert failed, values didn't match" % (newValue, value))
if showAll:
- wikipedia.output(u"date.formats['%s']['%s'](%d): '%s' -> %d" % (formatName, code, value, convFunc(value), newValue))
+ print(u"date.formats['%s']['%s'](%d): '%s' -> %d" % (formatName, code, value, convFunc(value), newValue))
except:
- wikipedia.output(u"********** Error in date.formats['%s']['%s'](%d)" % (formatName, code, value))
+ print(u"********** Error in date.formats['%s']['%s'](%d)" % (formatName, code, value))
raise
-# wikipedia.output( u"%s\t%s\t%f" % (formatName, code, time.clock() - startClock) )
+# print( u"%s\t%s\t%f" % (formatName, code, time.clock() - startClock) )
def test(quick = False, showAll = False):
- """This is a test function, to be used interactivelly to test entire format convesion map at once
+ """This is a test function, to be used interactively to test entire
+ format conversion map at once
+
Usage example:
run python interpreter
>>> import date
>>> date.test()
+
"""
for formatName in formats.keys():
@@ -1574,13 +1575,13 @@
testMapEntry( formatName, showAll, formatLimits[formatName][1] ) # Only test the first value in the test range
else:
testMapEntry( formatName, showAll ) # Extensive test! # Test decade rounding
- wikipedia.output(u"'%s' complete." % formatName)
+ print(u"'%s' complete." % formatName)
if quick:
- #wikipedia.output(u'Date module quick consistency test passed')
+ #print(u'Date module quick consistency test passed')
pass
else:
- wikipedia.output(u'Date module has been fully tested')
+ print(u'Date module has been fully tested')
#
Modified: branches/rewrite/pywikibot/page.py
===================================================================
--- branches/rewrite/pywikibot/page.py 2008-04-11 13:36:23 UTC (rev 5203)
+++ branches/rewrite/pywikibot/page.py 2008-04-11 20:29:11 UTC (rev 5204)
@@ -283,7 +283,7 @@
raise self._getexception
if force or not hasattr(self, "_revid") \
or not self._revid in self._revisions:
- self.site().getrevisions(self, getText=True, ids=None, sysop=sysop)
+ self.site().getrevisions(self, getText=True, sysop=sysop)
# TODO: Exception handling for no-page, redirects, etc.
return self._revisions[self._revid].text
@@ -307,7 +307,8 @@
"Page.getOldVersion(change_edit_time) option is deprecated.")
if force or not oldid in self._revisions:
self.site().getrevisions(self, getText=True, ids=oldid,
- redirs=get_redirect, sysop=sysop)
+ sysop=sysop)
+ # TODO: what about redirects, errors?
return self._revisions[oldid].text
def permalink(self):
@@ -678,7 +679,7 @@
else:
limit = revCount
return self.site().getrevisions(self, withText=False,
- older=reverseOrder, limit=limit)
+ older=not reverseOrder, limit=limit)
def getVersionHistoryTable(self, forceReload=False, reverseOrder=False,
getAll=False, revCount=500):
@@ -701,8 +702,7 @@
@return: A generator that yields tuples consisting of revision ID,
edit date/time, user name and content
"""
- return self.site().getrevisions(self, withText=True,
- older=reverseOrder, limit=None)
+ return self.site().getrevisions(self, withText=True)
def contributingUsers(self):
"""Return a set of usernames (or IPs) of users who edited this page."""
Modified: branches/rewrite/pywikibot/site.py
===================================================================
--- branches/rewrite/pywikibot/site.py 2008-04-11 13:36:23 UTC (rev 5203)
+++ branches/rewrite/pywikibot/site.py 2008-04-11 20:29:11 UTC (rev 5204)
@@ -100,11 +100,12 @@
self._username = user
# following are for use with lock_page and unlock_page methods
- self._mutex = threading.Lock()
+ self._pagemutex = threading.Lock()
self._locked_pages = []
pt_min = min(config.minthrottle, config.put_throttle)
- self.put_throttle = Throttle(self, pt_min, config.maxthrottle)
+ self.put_throttle = Throttle(self, pt_min, config.maxthrottle,
+ verbosedelay=True)
self.put_throttle.setDelay(config.put_throttle)
gt_min = min(config.minthrottle, config.get_throttle)
@@ -203,7 +204,6 @@
else:
return self.family().redirect.get(self.language(), None)
-
def lock_page(self, page, block=True):
"""Lock page for writing. Must be called before writing any page.
@@ -216,7 +216,7 @@
otherwise, raise an exception if page can't be locked
"""
- self._mutex.acquire()
+ self._pagemutex.acquire()
try:
while page in self._locked_pages:
if not block:
@@ -224,7 +224,7 @@
time.sleep(.25)
self._locked_pages.append(page.title(withSection=False))
finally:
- self._mutex.release()
+ self._pagemutex.release()
def unlock_page(self, page):
"""Unlock page. Call as soon as a write operation has completed.
@@ -233,11 +233,11 @@
@type page: pywikibot.Page
"""
- self._mutex.acquire()
+ self._pagemutex.acquire()
try:
self._locked_pages.remove(page.title(withSection=False))
finally:
- self._mutex.release()
+ self._pagemutex.release()
class APISite(BaseSite):
@@ -338,6 +338,7 @@
14: [u"Category"],
15: [u"Category talk"],
}
+ self.sitelock = threading.Lock()
return
# ANYTHING BELOW THIS POINT IS NOT YET IMPLEMENTED IN __init__()
@@ -600,14 +601,71 @@
"Cannot get category members of non-Category page '%s'"
% category.title())
cmtitle = category.title(withSection=False)
- cmgen = api.PageGenerator("categorymembers", gcmtitle=cmtitle,
+ cmgen = api.PageGenerator(u"categorymembers", gcmtitle=cmtitle,
gcmprop="ids|title|sortkey")
if namespaces is not None:
- cmgen.request["gcmnamespace"] = u"|".join(unicode(ns)
+ cmgen.request[u"gcmnamespace"] = u"|".join(unicode(ns)
for ns in namespaces)
return cmgen
+ def getrevisions(self, page=None, getText=False, revids=None,
+ older=True, limit=None, sysop=False, user=None,
+ excludeuser=None):
+ """Retrieve and store revision information.
+ @param page: retrieve the history of this Page (required unless ids
+ is specified)
+ @param getText: if True, retrieve the wiki-text of each revision as
+ well
+ @param revids: retrieve only the specified revision ids (required
+ unless page is specified)
+ @param older: if True, retrieve newest revisions first; otherwise,
+ retrieve oldest revisions first
+ @param limit: if specified, retrieve no more than this number of
+ revisions (defaults to latest revision only)
+ @type limit: int
+ @param user: retrieve only revisions authored by this user
+ @param excludeuser: retrieve all revisions not authored by this user
+ @param sysop: if True, switch to sysop account (if available) to
+ retrieve this page
+
+ """
+ if page is None and revids is None:
+ raise ValueError(
+ "getrevisions needs either page or revids argument.")
+ if page is not None:
+ rvtitle = page.title(withSection=False)
+ rvgen = api.PropertyGenerator(u"revisions", titles=rvtitle)
+ else:
+ ids = u"|".join(unicode(r) for r in revids)
+ rvgen = api.PropertyGenerator(u"revisions", revids=ids)
+ if getText:
+ rvgen.request[u"rvprop"] = \
+ u"ids|flags|timestamp|user|comment|content"
+ if page.section():
+ rvgen.request[u"rvsection"] = unicode(page.section())
+ if limit:
+ rvgen.request[u"rvlimit"] = unicode(limit)
+ if not older:
+ rvgen.request[u"rvdir"] = u"newer"
+ if user:
+ rvgen.request[u"rvuser"] = user
+ elif excludeuser:
+ rvgen.request[u"rvexcludeuser"] = excludeuser
+ # TODO if sysop:
+ for rev in rvgen:
+ revision = pywikibot.page.Revision(revid=rev['revid'],
+ timestamp=rev['timestamp'],
+ user=rev['user'],
+ anon=rev.has_key('anon'),
+ comment=rev.get('comment', u''),
+ minor=rev.has_key('minor'),
+ text=rev.get('*', None))
+ page._revisions[revision.revid] = revision
+ if revids is None and limit is None and user is None and excludeuser is None:
+ page._revid = revision.revid
+
+
#### METHODS NOT IMPLEMENTED YET (but may be delegated to Family object) ####
class NotImplementedYet:
Modified: branches/rewrite/pywikibot/throttle.py
===================================================================
--- branches/rewrite/pywikibot/throttle.py 2008-04-11 13:36:23 UTC (rev 5203)
+++ branches/rewrite/pywikibot/throttle.py 2008-04-11 20:29:11 UTC (rev 5204)
@@ -35,7 +35,7 @@
"""
def __init__(self, site, mindelay=config.minthrottle,
maxdelay=config.maxthrottle,
- multiplydelay=True):
+ multiplydelay=True, verbosedelay=False):
self.lock = threading.RLock()
self.mysite = str(site)
self.mindelay = mindelay
@@ -48,6 +48,7 @@
self.releasepid = 1800 # Free the process id after this many seconds
self.lastwait = 0.0
self.delay = 0
+ self.verbosedelay = verbosedelay
if multiplydelay:
self.checkMultiplicity()
self.setDelay(mindelay)
@@ -106,9 +107,10 @@
f.write("%(pid)s %(time)s %(site)s\n" % p)
f.close()
self.process_multiplicity = count
- pywikibot.output(
+ if self.verbosedelay:
+ pywikibot.output(
u"Found %s processes running, including the current process."
- % count)
+ % count)
finally:
self.lock.release()
> Log Message:
> -----------
> Fixing the regex according to the change of HTML
Whoa. That's a big change, much bigger than the summary suggests.
> @@ -828,6 +828,7 @@
> def previousRevision(self):
> """Return the revision id for the previous revision of this Page."""
> vh = self.getVersionHistory(revCount=2)
> + print vh
> return vh[1][0]
Forgot to remove a debug print? :)
> @@ -1154,7 +1166,7 @@
> force, callback))
>
> def put(self, newtext, comment=None, watchArticle=None, minorEdit=True,
> - force=False):
> + force=False, deleted = True):
Please document this new parameter, or rename it. As it stands, there are
two possible interpretations:
1) if the page was deleted, raise an error on creation
2) if the page was deleted, ignore the error on creation
From what I see, it's #1, but please document this ;)
On a side note, I think detecting this is a good thing, but I don't
understand why the default behavior is to raise an EditConflict error.
I believe it should not (#1, it's a backward-compatibility issue, and #2,
most of the time users don't care!), and we should instead modify, one by
one, the scripts that could benefit from this detection, if there are any.
> @@ -1297,7 +1310,7 @@
> time.sleep(5)
> continue
> # A second text area means that an edit conflict has occured.
> - if 'id=\'wpTextbox2\' name="wpTextbox2"' in data:
> + if 'id=\'wpTextbox2\' name="wpTextbox2"' in data and deleted == True:
> raise EditConflict(u'An edit conflict has occured.')
> if self.site().has_mediawiki_message("spamprotectiontitle")\
>     and self.site().mediawiki_message('spamprotectiontitle') in data:
Strange ! :)
- if 'id=\'wpTextbox2\' name="wpTextbox2"' in data and deleted == True:
+ if 'id=\'wpTextbox2\' name="wpTextbox2"' in data and deleted:
better, maybe ??
--
Nicolas Dumazet — NicDumZ
Second year, ENSIMAG.
Bugs item #1876637, was opened at 2008-01-21 18:39
Message generated for change (Settings changed) made by cosoleto
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=603138&aid=1876637&group_…
Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.
Category: None
Group: None
>Status: Open
Resolution: None
Priority: 4
Private: No
Submitted By: Alex S.H. Lin (lin4h)
Assigned to: Nobody/Anonymous (nobody)
Summary: cannot save exclusion db without english in copyright.py
Initial Comment:
The patch includes a copyright.py update; before applying it, there are some bugs we have to fix.
When the script gets the exclusion database from jawiki, it cannot save it into the text file and returns an error:
Updating file 'copyright\ja\jaclone.txt' ([[ja:Wikipedia:ウィキペディアを情報源とするサイト]])
Traceback (most recent call last):
  File "D:\My Documents\SOURCE\mwbot\pywikipedia\copyright.py", line 1188, in <module>
    excl_list = exclusion_list()
  File "D:\My Documents\SOURCE\mwbot\pywikipedia\copyright.py", line 360, in exclusion_list
    load_pages()
  File "D:\My Documents\SOURCE\mwbot\pywikipedia\copyright.py", line 319, in load_pages
    wikipedia.config.shortpath(path), page.aslink())
UnicodeEncodeError: 'cp950' codec can't encode character u'\u7ef4' in position 49: illegal multibyte sequence
I use windows XP SP2 and python 2.5
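The failure comes from writing a CJK page title through the Windows default cp950 (Big5) codec, which cannot represent the simplified character u'\u7ef4'. A minimal reproduction, not the copyright.py code itself, along with the usual remedy of encoding explicitly as UTF-8:

```python
title = u"\u7ef4\u57fa"  # simplified-Chinese characters, as in the jawiki/zhwiki lists

caught = None
try:
    title.encode("cp950")            # Big5 has no simplified u'\u7ef4', so this fails
except UnicodeEncodeError as exc:
    caught = exc

utf8_bytes = title.encode("utf-8")   # an explicit utf-8 encoding always succeeds
```

Opening the exclusion-list file with an explicit utf-8 encoding instead of the platform default would avoid the crash, whatever the console codepage is.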
----------------------------------------------------------------------
Comment By: Francesco Cosoleto (cosoleto)
Date: 2008-04-10 13:39
Message:
Logged In: YES
user_id=181280
Originator: NO
I have applied your patch in my source tree and tried changing my user
config too, but I cannot reproduce this bug. Is it still present? If it is,
please include more information: your user-config.py (without
google/yahoo/msn access keys), PyWikipediaBot version... Anyway, to add
support for ja and zh pages, additional code is needed.
----------------------------------------------------------------------
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=603138&aid=1876637&group_…
Bugs item #1876637, was opened at 2008-01-21 18:39
Message generated for change (Comment added) made by cosoleto
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=603138&aid=1876637&group_…
Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.
Category: None
Group: None
>Status: Pending
Resolution: None
>Priority: 4
Private: No
Submitted By: Alex S.H. Lin (lin4h)
Assigned to: Nobody/Anonymous (nobody)
Summary: cannot save exclusion db without english in copyright.py
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=603138&aid=1876637&group_…
Feature Requests item #1939195, was opened at 2008-04-10 01:01
Message generated for change (Tracker Item Submitted) made by Item Submitter
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=603141&aid=1939195&group_…
Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.
Category: Interface Improvements (example)
Group: None
Status: Open
Priority: 5
Private: No
Submitted By: Nobody/Anonymous (nobody)
Assigned to: Nobody/Anonymous (nobody)
Summary: Translation for simple wiki
Initial Comment:
Please add the language entries for Simple English Wikipedia into the scripts. It can be annoying re-entering them after checking out the files.
I am Chenzw on Wikipedia.
----------------------------------------------------------------------
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=603141&aid=1939195&group_…