pywikibot October 2007

pywikibot@lists.wikimedia.org

24 participants
197 discussions

[Pywikipedia-l] SVN: [4406] trunk/pywikipedia/inline_images.py
by leogregianin＠svn.wikimedia.org 04 Oct '07

Revision: 4406
Author: leogregianin
Date: 2007-10-03 17:07:26 +0000 (Wed, 03 Oct 2007)

Log Message:
-----------
pt msg and fix show help

Modified Paths:
--------------
    trunk/pywikipedia/inline_images.py

Modified: trunk/pywikipedia/inline_images.py
===================================================================
--- trunk/pywikipedia/inline_images.py  2007-10-03 16:48:25 UTC (rev 4405)
+++ trunk/pywikipedia/inline_images.py  2007-10-03 17:07:26 UTC (rev 4406)
@@ -43,7 +43,8 @@

 msg = {
     'en': u'This image was inline linked from %s. No information on author, copyright status, or license is available.',
-    'pl': u'Obraz ten został dolinkowany z adresu %s. Brak jest informacji o autorze, prawach autorskich czy licencji.'
+    'pl': u'Obraz ten został dolinkowany z adresu %s. Brak jest informacji o autorze, prawach autorskich czy licencji.',
+    'pt': u'Esta imagem foi inserida como linha de %s. Nenhum infomação sobre autor, direitos autorais ou licença foi listada.',
 }

 ###################################
@@ -104,7 +105,7 @@
         page = wikipedia.Page(wikipedia.getSite(), ' '.join(pageTitle))
         gen = iter([page])
     if not gen:
-        wikipedia.showHelp('touch')
+        wikipedia.showHelp('inline_images')
     else:
         preloadingGen = pagegenerators.PreloadingGenerator(gen)
         bot = InlineImagesRobot(preloadingGen)

[Pywikipedia-l] [ pywikipediabot-Feature Requests-1800925 ] wikipedia.search and page generator
by SourceForge.net 04 Oct '07

Feature Requests item #1800925, was opened at 2007-09-24 03:14
Message generated for change (Settings changed) made by leogregianin
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=603141&aid=1800925&group_…

Please note that this message will contain a full copy of the comment thread, including the initial issue submission, for this request, not just the latest update.

Category: None
Group: None
>Status: Closed
Priority: 5
Private: No
Submitted By: John Vandenberg (zeroj)
>Assigned to: Leonardo Gregianin (leogregianin)
Summary: wikipedia.search and page generator

Initial Comment:
The MediaWiki search functionality needs to be exposed to the bot framework, especially as a page generator as the Google search depends on an inaccessible feature, and the Yahoo search API is limited to 100 results.

----------------------------------------------------------------------

You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=603141&aid=1800925&group_…

[Pywikipedia-l] SVN: [4405] trunk/pywikipedia
by leogregianin＠svn.wikimedia.org 04 Oct '07

Revision: 4405
Author: leogregianin
Date: 2007-10-03 16:48:25 +0000 (Wed, 03 Oct 2007)

Log Message:
-----------
Patch 1800925: wikipedia.search and page generator by John Vandenberg (NOTE: I didn't obtain to function this)

Modified Paths:
--------------
    trunk/pywikipedia/family.py
    trunk/pywikipedia/pagegenerators.py
    trunk/pywikipedia/wikipedia.py

Modified: trunk/pywikipedia/family.py
===================================================================
--- trunk/pywikipedia/family.py  2007-10-03 14:55:17 UTC (rev 4404)
+++ trunk/pywikipedia/family.py  2007-10-03 16:48:25 UTC (rev 4405)
@@ -2613,6 +2613,30 @@
     def api_address(self, code):
         return '%s?' % self.apipath(code)

+    def search_address(self, code, query, limit=100, namespaces = None):
+        """
+        Constructs a URL for searching using Special:Search
+        'namespaces' may be an int or a list; an empty list selects
+        all namespaces. Defaults to namespace 0
+        """
+        namespace_params = ''
+        if namespaces is not None:
+            if isinstance(namespaces, int):
+                namespace_params = "&ns%d=1" % namespaces
+            elif isinstance (namespaces, list):
+                if len(namespaces) == 0:
+                    # add all namespaces
+                    namespaces = self.namespaces.keys()
+                for i in namespaces:
+                    if i > 0:
+                        namespace_params = namespace_params + '&ns%d=1' % i
+
+        return "%s?title=%s:Search&search=%s&limit=%d%s" % (self.path(code),
+                                                            self.special_namespace_url(code),
+                                                            query,
+                                                            limit,
+                                                            namespace_params)
+
     def allpages_address(self, code, start, namespace = 0):
         if self.version(code)=="1.2":
             return '%s?title=%s:Allpages&printable=yes&from=%s' % (

Modified: trunk/pywikipedia/pagegenerators.py
===================================================================
--- trunk/pywikipedia/pagegenerators.py  2007-10-03 14:55:17 UTC (rev 4404)
+++ trunk/pywikipedia/pagegenerators.py  2007-10-03 16:48:25 UTC (rev 4405)
@@ -36,6 +36,9 @@
                   Depends on python module pYsearch. See yahoo_appid in
                   config.py for instructions.

+-search           Work on all pages that are found in a MediaWiki search
+                  across all namespaces.
+
 -google           Work on all pages that are found in a Google search.
                   You need a Google Web API license key. Note that Google
                   doesn't give out license keys anymore. See google_key in
@@ -290,6 +293,15 @@
             yield wikipedia.Page(site, pagenameofthelink)
         offset += step

+def SearchPageGenerator(query, number = 100, namespaces = None, site = None):
+    """
+    Provides a list of results using the internal MediaWiki search engine
+    """
+    if site is None:
+        site = wikipedia.getSite()
+    for page in site.search(query, number=number, namespaces = namespaces):
+        yield page[0]
+
 class YahooSearchPageGenerator:
     '''
     To use this generator, install pYsearch
@@ -745,6 +757,14 @@
                 gen = NewpagesPageGenerator(number = int(arg[5:]))
             else:
                 gen = NewpagesPageGenerator(number = 60)
+        elif arg.startswith('-search'):
+            if len(arg) == 8:
+                mediawikiQuery = wikipedia.input(u'What do you want to search for?')
+            else:
+                mediawikiQuery = arg[8:]
+            # In order to be useful, all namespaces are required
+            gen = SearchPageGenerator(mediawikiQuery, namespaces = [])
+
         elif arg.startswith('-google'):
             if len(arg) == 7:
                 googleQuery = wikipedia.input(u'What do you want to search for?')

Modified: trunk/pywikipedia/wikipedia.py
===================================================================
--- trunk/pywikipedia/wikipedia.py  2007-10-03 14:55:17 UTC (rev 4404)
+++ trunk/pywikipedia/wikipedia.py  2007-10-03 16:48:25 UTC (rev 4405)
@@ -3666,6 +3666,44 @@
         except KeyError:
             return False

+    def search(self, query, number = 10, repeat = False, namespaces = None):
+        """
+        Generator which yields search results
+        """
+        seen = set()
+        throttle = True
+        while True:
+            path = self.search_address(query, n=number, ns = namespaces)
+            get_throttle()
+            html = self.getUrl(path)
+            entryR = re.compile(ur'<li[^>]*><a href=".+?" title="(?P<title>.+?)">.+?</a>'
+                                '(?P<match>.*?)<br ?/><span[^>]*>Relevance: '
+                                '(?P<relevance>[0-9.]+)% - '
+                                '(?P<size>[0-9.]+) '
+                                '(?P<sizeunit>[A-Za-z]+) '
+                                '\((?P<words>.+?) words\) - '
+                                '(?P<date>.+?)</span></li>', re.DOTALL)
+
+            for m in entryR.finditer(html):
+                title = m.group('title')
+
+                if title not in seen:
+                    seen.add(title)
+                    page = Page(self, title)
+
+                    match = m.group('match')
+                    relevance = m.group('relevance')
+                    size = m.group('size')
+                    # sizeunit appears to always be "KB"
+                    words = m.group('words')
+                    date = m.group('date')
+
+                    #print "%s - %s %s (%s words) - %s" % (relevance, size, sizeunit, words, date)
+
+                    yield page, match, relevance, size, words, date
+            if not repeat:
+                break
+
     # TODO: avoid code duplication for the following methods
     def newpages(self, number = 10, get_redirect = False, repeat = False):
         """Generator which yields new articles subsequently.
@@ -4213,6 +4251,9 @@
         if self.encoding().lower() != charset.lower():
             raise ValueError("code2encodings has wrong charset for %s. It should be %s, but is %s" % (repr(self), charset, self.encoding()))

+    def search_address(self, q, n=50, ns = 0):
+        return self.family.search_address(self.lang, q, n, ns)
+
     def allpages_address(self, s, ns = 0):
         return self.family.allpages_address(self.lang, start = s, namespace = ns)
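
For readers who want to see what URL the new family.search_address() is meant to produce, here is a minimal standalone sketch in modern Python. It is not the committed code: build_search_url, index_path, special_ns and known_namespaces are hypothetical stand-ins for the method and for family.path(code), family.special_namespace_url(code) and family.namespaces.keys(). It also differs from the patch in two labelled ways: the query is percent-encoded, and namespace 0 is kept when a list is passed (the patch filters on i > 0).

```python
from urllib.parse import quote


def build_search_url(index_path, special_ns, query, limit=100,
                     namespaces=None, known_namespaces=()):
    """Build a Special:Search URL in the style of the r4405 patch.

    All parameter names are hypothetical; see the note above.
    """
    if namespaces is None:
        selected = []                        # no nsN parameters at all
    elif isinstance(namespaces, int):
        selected = [namespaces]              # a single namespace number
    elif len(namespaces) == 0:
        selected = sorted(known_namespaces)  # empty list means "all"
    else:
        selected = list(namespaces)

    # Assumption: negative namespace numbers are skipped, while
    # namespace 0 is kept (unlike the "i > 0" test in the patch).
    ns_params = ''.join('&ns%d=1' % ns for ns in selected if ns >= 0)

    # Assumption: the query is percent-encoded; the patch inserts it as-is.
    return '%s?title=%s:Search&search=%s&limit=%d%s' % (
        index_path, special_ns, quote(query), limit, ns_params)


url = build_search_url('/w/index.php', 'Special', 'foo bar', namespaces=[0, 4])
print(url)
```

Passing namespaces=[] together with the wiki's full namespace list mirrors what the -search handler in pagegenerators.py requests.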

[Pywikipedia-l] SVN: [4404] trunk/pywikipedia/pagegenerators.py
by leogregianin＠svn.wikimedia.org 03 Oct '07

Revision: 4404
Author: leogregianin
Date: 2007-10-03 14:55:17 +0000 (Wed, 03 Oct 2007)

Log Message:
-----------
Patch 1784625 by Leszek Krupiński

Modified Paths:
--------------
    trunk/pywikipedia/pagegenerators.py

Modified: trunk/pywikipedia/pagegenerators.py
===================================================================
--- trunk/pywikipedia/pagegenerators.py  2007-10-03 14:30:15 UTC (rev 4403)
+++ trunk/pywikipedia/pagegenerators.py  2007-10-03 14:55:17 UTC (rev 4404)
@@ -505,6 +505,17 @@
             seenPages.append(page)
             yield page

+def RegexFilterPageGenerator(generator, regex):
+    """
+    Wraps around another generator. Yields only thos pages, which titles are positively
+    matched to regex.
+    """
+    reg = re.compile(regex, re.I)
+
+    for page in generator:
+        if reg.match(page.titleWithoutNamespace()):
+            yield page
+
 def CombinedPageGenerator(generators):
     """
     Wraps around a list of other generators. Yields all pages generated by the
@@ -735,11 +746,17 @@
             else:
                 gen = NewpagesPageGenerator(number = 60)
         elif arg.startswith('-google'):
-            if len(arg) == 8:
+            if len(arg) == 7:
                 googleQuery = wikipedia.input(u'What do you want to search for?')
             else:
                 googleQuery = arg[8:]
             gen = GoogleSearchPageGenerator(googleQuery)
+        elif arg.startswith('-regex'):
+            if len(arg) == 6:
+                regex = wikipedia.input(u'What page names are you looking for?')
+            else:
+                regex = arg[7:]
+            gen = RegexFilterPageGenerator(wikipedia.getSite().allpages(), regex)
         elif arg.startswith('-yahoo'):
             if len(arg) == 7:
                 query = wikipedia.input(u'What do you want to search for?')
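
The filter added in this revision can be tried without a wiki connection. The sketch below repeats the same filtering logic in modern Python against a small stand-in class; FakePage and regex_filter are hypothetical names used for illustration only and are not part of pywikipedia. Note that re.match() anchors at the start of the string, so a pattern matches only titles that begin with it.

```python
import re


class FakePage:
    """Hypothetical stand-in for wikipedia.Page, for illustration only."""

    def __init__(self, title):
        self._title = title

    def titleWithoutNamespace(self):
        # Drop a leading "Namespace:" prefix, if there is one.
        return self._title.split(':', 1)[-1]


def regex_filter(generator, regex):
    """Yield only pages whose namespace-less title matches regex."""
    reg = re.compile(regex, re.I)        # case-insensitive, as in the patch
    for page in generator:
        if reg.match(page.titleWithoutNamespace()):
            yield page


pages = [FakePage(t) for t in
         ('List of rivers', 'Talk:List of lakes', 'Shortlist')]
matched = [p.titleWithoutNamespace() for p in regex_filter(pages, r'list')]
print(matched)
```

Here 'Shortlist' is filtered out because the pattern does not match at the start of the title; a pattern such as '.*list' would be needed to match anywhere.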

[Pywikipedia-l] [ pywikipediabot-Patches-1784625 ] Regular expression page generator
by SourceForge.net 03 Oct '07

Patches item #1784625, was opened at 2007-08-30 03:41
Message generated for change (Settings changed) made by leogregianin
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=603140&aid=1784625&group_…

Please note that this message will contain a full copy of the comment thread, including the initial issue submission, for this request, not just the latest update.

Category: None
Group: None
>Status: Closed
>Resolution: Accepted
Priority: 5
Private: No
Submitted By: Leszek Krupiński (leszek_k)
Assigned to: Nobody/Anonymous (nobody)
Summary: Regular expression page generator

Initial Comment:
Included patch adds new page generator, which yields pages with title matching to given regular expression.

Additionally it fixes small bug with google page generator.

----------------------------------------------------------------------

Comment By: Leszek Krupiński (leszek_k)
Date: 2007-10-02 07:32

Message:
Logged In: YES
user_id=546434
Originator: YES

I've updated patch, changing from generator to regex filter.
File Added: regexgenerator.patch

----------------------------------------------------------------------

You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=603140&aid=1784625&group_…
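
The "small bug" is visible in the r4404 diff: the handler compared len(arg) with 8, but the bare option -google is seven characters long, so the interactive prompt was never reached and arg[8:] produced an empty query. The -search branch added in r4405 appears to carry the same comparison against 8. One way to avoid hard-coded lengths is sketched below; split_option is a hypothetical helper, not a pywikipedia function.

```python
def split_option(arg, name):
    """Split a command-line argument of the form 'name' or 'name:value'.

    Returns '' for the bare option, the text after the colon for
    'name:value', and None when arg is a different option.
    """
    if arg == name:
        return ''
    if arg.startswith(name + ':'):
        return arg[len(name) + 1:]
    return None


for arg in ('-google', '-google:pywikipedia', '-googlefoo'):
    print(arg, '->', repr(split_option(arg, '-google')))
```

A caller would prompt the user when the result is the empty string and skip the branch when it is None.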

[Pywikipedia-l] [ pywikipediabot-Patches-1736292 ] Allow autosummary in pagefromfile.py (simple)
by SourceForge.net 03 Oct '07

Patches item #1736292, was opened at 2007-06-13 03:48
Message generated for change (Settings changed) made by leogregianin
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=603140&aid=1736292&group_…

Please note that this message will contain a full copy of the comment thread, including the initial issue submission, for this request, not just the latest update.

Category: None
Group: None
>Status: Closed
>Resolution: Accepted
Priority: 5
Private: No
Submitted By: Purodha B Blissenbach (purodha)
Assigned to: Nobody/Anonymous (nobody)
Summary: Allow autosummary in pagefromfile.py (simple)

Initial Comment:
Since Mediawiki offers to generate an automated edit summary for new pages, it may be desirable to let this feature be used in pagefromfile.py - specifically with batch uploads, this would result often in more useful individual edit summaries, such as "redirects to [[xyzzy]]", etc.

The change to make this possible is very simple. Here is a patch:

1. In the parameter description section, add two lines resulting in this:

-summary:xxx    Use xxx as the summary for the upload
                if xxx is the empty string, and a new page is created,
                mediawikis autosummary feature generates a summary

2. Few lines before the end of pagefromfile.py, add two lines resulting in this code:

        elif arg.startswith("-summary:"):
            commenttext=arg[9:]
            if commenttext == '':
                wikipedia.setAction('')
        else:
            wikipedia.output(u"Disregarding unknown argument %s." % arg)

3. This is already it.

----------------------------------------------------------------------

Comment By: Francesco Cosoleto (cosoleto)
Date: 2007-10-01 14:16

Message:
Logged In: YES
user_id=181280
Originator: NO

Feature added in r4397.

----------------------------------------------------------------------

Comment By: Purodha B Blissenbach (purodha)
Date: 2007-06-21 07:11

Message:
Logged In: YES
user_id=46450
Originator: YES

File Added: pagefromfile_summary_autosummary.py

----------------------------------------------------------------------

Comment By: Purodha B Blissenbach (purodha)
Date: 2007-06-21 07:08

Message:
Logged In: YES
user_id=46450
Originator: YES

After the latest reworking of pagefromfile.py in cvs, which made the -summary command line parameter useless, my above recommendation was lost, and after reinserting it, did not work any more.
Attached is a patch that makes both work as of today.
File Added: pagefromfile_summary_autosummary.diff

----------------------------------------------------------------------

Comment By: Purodha B Blissenbach (purodha)
Date: 2007-06-15 03:45

Message:
Logged In: YES
user_id=46450
Originator: YES

Yes, sure I tested it, see e.g. the summaries in this edit:
http://ksh.wikipedia.org/w/index.php?title=Ihrefeldt_%28K%C3%B6lle%29&actio…
and in this edit:
http://ksh.wikipedia.org/w/index.php?title=7._S%C3%ABpt%C3%A4mbo&action=his…
Both have been created in one bot run. ("Leit öm op" = "Redirects to", localized)

When put() gets an *empty* or None comment parameter, it uses the action defined by setAction() which by default is something like "PyWikipediBot framework". This is why I made it empty, when commenttext==''.

put(... comment = '' ...) makes Mediawiki use its autosummary feature on new pages.

----------------------------------------------------------------------

Comment By: Daniel Herding (wikipedian)
Date: 2007-06-14 06:35

Message:
Logged In: YES
user_id=880694
Originator: NO

Have you tested this? I'm not sure this works, as pages are saved using put(... comment = commenttext ...). When put gets a comment parameter, it will ignore the action defined by the setAction() call.

----------------------------------------------------------------------

You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=603140&aid=1736292&group_…
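
The fallback that Purodha describes in the thread can be modelled in a few lines. This is only a sketch of the behaviour as stated in the comments above, not code taken from wikipedia.py; effective_summary and DEFAULT_ACTION are hypothetical names, and the default text is a placeholder.

```python
# Placeholder for whatever setAction() holds by default (assumption).
DEFAULT_ACTION = 'default bot summary'


def effective_summary(comment, action=DEFAULT_ACTION):
    """Return the edit summary that would be sent to the wiki.

    Models the behaviour described in the thread: a non-empty comment
    is used as given; an empty or None comment falls back to the
    action set via setAction().
    """
    if comment:
        return comment
    return action


# Normal case: an explicit summary is used unchanged.
print(repr(effective_summary('Bot: uploading pages')))

# "-summary:" with an empty value: the script also calls setAction(''),
# so an empty summary reaches MediaWiki, which then writes its own
# automatic summary for a new page.
print(repr(effective_summary('', action='')))
```

Under this model, Daniel Herding's concern and Purodha's reply are both consistent: a non-empty comment bypasses the action, and only the empty case depends on it.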

[Pywikipedia-l] [ pywikipediabot-Bugs-1773949 ] wrong argument description of movepages.py
by SourceForge.net 03 Oct '07

Bugs item #1773949, was opened at 2007-08-14 11:31
Message generated for change (Settings changed) made by leogregianin
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=603138&aid=1773949&group_…

Please note that this message will contain a full copy of the comment thread, including the initial issue submission, for this request, not just the latest update.

Category: None
Group: None
>Status: Pending
Resolution: Accepted
Priority: 3
Private: No
Submitted By: Falk Steinhauer (falk_steinhauer)
Assigned to: Leonardo Gregianin (leogregianin)
Summary: wrong argument description of movepages.py

Initial Comment:
I am using snapshot 2007-06-19.

The commandline help of movepages.py shows an option called "-addprefix", but the respective option in the sourcecode is spelled "-prefix". Users that are not able to understand the sourcecode might not be able to understand how to call the script.

Another thing is, that this script is not using page title highlighting like replace.py when user interaction is desired. The first line in "def treat(self,page)" should be:

colors = [None] * 6 + [13] * len(page.title()) + [None] * 4

----------------------------------------------------------------------

Comment By: Leonardo Gregianin (leogregianin)
Date: 2007-09-20 12:38

Message:
Logged In: YES
user_id=1136737
Originator: NO

See this snapshot http://tools.wikimedia.de/~valhallasw/pywiki/

----------------------------------------------------------------------

Comment By: Falk Steinhauer (falk_steinhauer)
Date: 2007-09-13 04:06

Message:
Logged In: YES
user_id=1810075
Originator: YES

Bug still exists in snapshot 2007-08-11.

----------------------------------------------------------------------

You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=603138&aid=1773949&group_…
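
The suggested colors line builds one list entry per character of the text to be printed: None for uncoloured characters and a colour code for each character of the page title. The sketch below only evaluates that expression for a sample title. The values 6, 13 and 4 are taken from the report; what the six leading and four trailing characters are in the actual prompt is not stated there, and title_colors is a hypothetical helper name.

```python
def title_colors(title, prefix_len=6, suffix_len=4, color=13):
    """Build a per-character colour list as in the suggested line.

    None means "no colour"; the title characters all get `color`.
    """
    return [None] * prefix_len + [color] * len(title) + [None] * suffix_len


colors = title_colors('Main Page')
print(len(colors))   # one entry per character: 6 + 9 + 4
print(colors)
```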

[Pywikipedia-l] [ pywikipediabot-Patches-1802206 ] botwiki_family update
by SourceForge.net 03 Oct '07

Patches item #1802206, was opened at 2007-09-25 14:35
Message generated for change (Settings changed) made by leogregianin
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=603140&aid=1802206&group_…

Please note that this message will contain a full copy of the comment thread, including the initial issue submission, for this request, not just the latest update.

Category: None
Group: None
>Status: Closed
>Resolution: Accepted
Priority: 5
Private: No
Submitted By: Snowolf (mlussetti)
>Assigned to: Leonardo Gregianin (leogregianin)
Summary: botwiki_family update

Initial Comment:
Botwiki is now running mediawiki 1.11.

----------------------------------------------------------------------

You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=603140&aid=1802206&group_…

[Pywikipedia-l] SVN: [4403] trunk/pywikipedia/families/botwiki_family.py
by leogregianin＠svn.wikimedia.org 03 Oct '07

Revision: 4403
Author: leogregianin
Date: 2007-10-03 14:30:15 +0000 (Wed, 03 Oct 2007)

Log Message:
-----------
Patch 1802206 by Snowolf

Modified Paths:
--------------
    trunk/pywikipedia/families/botwiki_family.py

Modified: trunk/pywikipedia/families/botwiki_family.py
===================================================================
--- trunk/pywikipedia/families/botwiki_family.py  2007-10-03 14:24:58 UTC (rev 4402)
+++ trunk/pywikipedia/families/botwiki_family.py  2007-10-03 14:30:15 UTC (rev 4403)
@@ -63,7 +63,7 @@
         }

     def version(self, code):
-        return "1.10.0"
+        return "1.11.0"

     def path(self, code):
         return '/w/index.php'

[Pywikipedia-l] SVN: [4402] trunk/pywikipedia/clean_sandbox.py
by leogregianin＠svn.wikimedia.org 03 Oct '07

Revision: 4402
Author: leogregianin
Date: 2007-10-03 14:24:58 +0000 (Wed, 03 Oct 2007)

Log Message:
-----------
Patch 1805732 by shizhao

Modified Paths:
--------------
    trunk/pywikipedia/clean_sandbox.py

Modified: trunk/pywikipedia/clean_sandbox.py
===================================================================
--- trunk/pywikipedia/clean_sandbox.py  2007-10-03 12:52:09 UTC (rev 4401)
+++ trunk/pywikipedia/clean_sandbox.py  2007-10-03 14:24:58 UTC (rev 4402)
@@ -32,7 +32,8 @@
     'no': u'{{Sandkasse}}\n}}',
     'pl': u'{{Prosimy - NIE ZMIENIAJ, NIE KASUJ, NIE PRZENOŚ tej linijki - pisz niżej}}',
     'pt': u'{{página de testes}}\r\n',
-    'commons': u'{{Sandbox}}\n'
+    'commons': u'{{Sandbox}}\n',
+    'zh': u'{{subst:User:Sz-iwbot/sandbox}}\r\n',
     }

 msg = {
@@ -43,6 +44,7 @@
     'no': u'bot: Rydder sandkassa.',
     'pl': u'Robot czyści brudnopis',
     'pt': u'Bot: Limpeza da página de testes',
+    'zh': u'Bot: 本页被自动清理',
     }

 sandboxTitle = {
@@ -53,7 +55,8 @@
     'no': u'Wikipedia:Sandkasse',
     'pl': u'Wikipedia:Brudnopis',
     'pt': u'Wikipedia:Página de testes',
-    'commons': u'Commons:Sandbox'
+    'commons': u'Commons:Sandbox',
+    'zh': u'wikipedia:沙盒',
     }

 class SandboxBot:
