Revision: 6337
Author: russblau
Date: 2009-02-08 11:04:00 +0000 (Sun, 08 Feb 2009)
Log Message:
-----------
improved some examples
Modified Paths:
--------------
trunk/pywikipedia/families/README-family.txt
Modified: trunk/pywikipedia/families/README-family.txt
===================================================================
--- trunk/pywikipedia/families/README-family.txt 2009-02-08 04:14:37 UTC (rev 6336)
+++ trunk/pywikipedia/families/README-family.txt 2009-02-08 11:04:00 UTC (rev 6337)
@@ -15,8 +15,8 @@
After you copy the text, go through and edit it, based upon the comment
lines. First, do a global search-and-replace to change all instances of
'WIKINAME' to your actual wiki name. Everything in the example below is
-based on the bot's default settings, except for the namespace names, which
-are made-up examples. You only need to change it if your wiki's value is
+based on the bot's default settings, except for those that are marked as
+examples. You only need to change it if your wiki's value is
different from the default. You can delete anything that is not indicated as
"REQUIRED", if your new wiki doesn't vary from the default settings.
@@ -43,14 +43,14 @@
# You only need to enter translations that differ from the default.
# There are two ways of entering namespace translations.
# 1. If you only need to change the translation of a particular
- # namespace for one or two languages, use this format:
+ # namespace for one or two languages, use this format (example):
self.namespaces[2]['en'] = u'Wikiuser'
self.namespaces[3]['en'] = u'Wikiuser talk'
# 2. If you need to change the translation for many languages
# for the same namespace number, use this format (this is common
# for namespaces 4 and 5, because these are usually given a
- # unique name for each wiki):
+ # unique name for each wiki) (example):
self.namespaces[4] = {
'_default': [u'WIKINAME', self.namespaces[4]['_default']], # REQUIRED
'de': 'Name des wiki',
@@ -66,33 +66,29 @@
# On most wikis page names must start with a capital letter, but some
# languages don't use this. This should be a list of languages that
- # _don't_ require the first letter to be capitalized; e.g.,
+ # _don't_ require the first letter to be capitalized. Example:
# self.nocapitalize = ['foo', 'bar']
- self.nocapitalize = []
# SETTINGS FOR WIKIS THAT USE DISAMBIGUATION PAGES:
- # A list of disambiguation template names in different languages
- self.disambiguationTemplates = {
- 'en': ['disambig', 'disambiguation'],
- }
+ # Disambiguation template names in different languages; each value
+ # must be a list, even if there is only one entry. Example:
+ # self.disambiguationTemplates['en'] = ['disambig', 'disambiguation']
- # A list with the name of the category containing disambiguation
+ # The name of the category containing disambiguation
# pages for the various languages. Only one category per language,
# and without the namespace, so add things like:
- self.disambcatname = {
- 'en': "Disambiguation",
- }
+ # self.disambcatname['en'] = "Disambiguation"
# SETTINGS FOR WIKIS THAT USE INTERLANGUAGE LINKS:
# attop is a list of languages that prefer to have the interwiki
- # links at the top of the page.
- self.interwiki_attop = []
+ # links at the top of the page. Example:
+ # self.interwiki_attop = ['de', 'xz']
# on_one_line is a list of languages that want the interwiki links
- # one-after-another on a single line
- self.interwiki_on_one_line = []
+ # one-after-another on a single line. Example:
+ # self.interwiki_on_one_line = ['aa', 'cc']
# String used as separator between interwiki links and the text
self.interwiki_text_separator = '\r\n\r\n'
@@ -171,7 +167,7 @@
# raise NotImplementedError, "%s wiki family does not support api.php" % self.name
return '%s/api.php' % self.scriptpath(code)
- # Which version of MediaWiki is used?
+ # Which version of MediaWiki is used? REQUIRED
def version(self, code):
# Replace with the actual version being run on your wiki
return '1.13alpha'
Bugs item #2356220, was opened at 2008-11-28 11:42
Message generated for change (Settings changed) made by russblau
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=603138&aid=2356220&group_…
Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.
Category: General
Group: None
>Status: Pending
Resolution: None
Priority: 7
Private: No
Submitted By: shizhao (wikishizhao)
Assigned to: Nobody/Anonymous (nobody)
Summary: fix newpages()
Initial Comment:
fix username in newpages() on wikipedia.py
----------------------------------------------------------------------
Comment By: siebrand (siebrand)
Date: 2009-01-27 04:11
Message:
Are the issue and the patch still current? If so, please confirm, check if
the patch still applies, and upload a revised patch. Remove outdated
patch(es) in any case.
----------------------------------------------------------------------
Comment By: shizhao (wikishizhao)
Date: 2008-11-28 23:30
Message:
is python 2.5+. This Mediawiki update add class="mw-userlink" and
class="new mw-userlink", so pywikipedia also update.
----------------------------------------------------------------------
Comment By: Legoktm (legoktm)
Date: 2008-11-28 23:01
Message:
What version of python are you using? I get this error while running
2.3.5, but not with 2.5+.
----------------------------------------------------------------------
Comment By: Legoktm (legoktm)
Date: 2008-11-28 23:01
Message:
What version of python are you using? I get this error while running
2.3.5, but not with 2.5+.
----------------------------------------------------------------------
Comment By: Legoktm (legoktm)
Date: 2008-11-28 22:58
Message:
whoops, wrong bug
----------------------------------------------------------------------
Comment By: Legoktm (legoktm)
Date: 2008-11-28 22:58
Message:
What version of python are you using? I get this error while running
2.3.5, but not with 2.5+.
----------------------------------------------------------------------
Comment By: shizhao (wikishizhao)
Date: 2008-11-28 12:07
Message:
add fix new user.
File Added: wikipedia.diff
----------------------------------------------------------------------
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=603138&aid=2356220&group_…
Bugs item #1936118, was opened at 2008-04-06 13:10
Message generated for change (Comment added) made by russblau
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=603138&aid=1936118&group_…
Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.
Category: None
Group: None
>Status: Closed
>Resolution: Out of Date
Priority: 7
Private: No
Submitted By: Daniel Herding (wikipedian)
Assigned to: Nobody/Anonymous (nobody)
Summary: Recreating deleted pages failed.
Initial Comment:
How to reproduce: Create a page in your browser, then delete it. Then recreate it using PyWikiBot. You will get this error message:
File "/home/daniel/projekte/pywikipedia/wikipedia.py", line 1331, in _putPage
raise EditConflict(u'Someone deleted the page.')
EditConflict: Someone deleted the page.
I have then changed this line in Page._putPage():
if '<label for=\'wpRecreate\'' in data:
->
if '<label for=\'wpRecreate\'' in data and not newPage:
But that doesn't completely solve it because of a new MediaWiki feature. I get this error message:
<p>Benutzer <a href="/wiki/Benutzer:Ureinwohner" title="Benutzer:Ureinwohner">Ureinwohner</a> (<a href="/wiki/Benutzer_Diskussion:Ureinwohner" title="Benutzer Diskussion:Ureinwohner">Diskussion</a>) hat diesen Artikel gelöscht, nachdem du angefangen hast, ihn zu bearbeiten. Die Begründung lautete: <i>veraltete bildwarnung</i>
</p><p>Bitte bestätige, dass du diesen Artikel wirklich wieder neu anlegen möchtest.
</p><br /><input tabindex='1' type='checkbox' value='1' name='wpRecreate' id='wpRecreate' /><label for='wpRecreate' title='Wiederherstellen.'>Erneut anlegen</label>
----------------------------------------------------------------------
>Comment By: Russell Blau (russblau)
Date: 2009-02-08 05:50
Message:
works now (with a page that I confirmed had previously existed and been
deleted):
>>> import wikipedia
Checked for running processes. 1 processes currently running, including
the current process.
>>> s = wikipedia.getSite()
>>> p = wikipedia.Page(s, "User:RussBot/Current list")
>>> p.exists()
False
>>> p.put("{{db-userreq}}", "Test edit")
Creating page [[User:RussBot/Current list]]
(302, 'Moved Temporarily', u'')
>>> wikipedia.stopme()
>>>
----------------------------------------------------------------------
Comment By: Daniel Herding (wikipedian)
Date: 2008-04-06 18:02
Message:
Logged In: YES
user_id=880694
Originator: YES
Some more info on this. I'm working on the German Wikipedia, on the page
[[Diskussion:ICLEI]] which weblinkchecker.py tried to create. The error
that I get is [[MediaWiki:Confirmrecreate]]. A comment in wikipedia.py
says:
# Make sure your system clock is correct if this error occurs
# without any reason!
My system clock is correct.
----------------------------------------------------------------------
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=603138&aid=1936118&group_…
Bugs item #2562477, was opened at 2009-02-03 18:30
Message generated for change (Settings changed) made by russblau
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=603138&aid=2562477&group_…
Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.
Category: None
Group: None
>Status: Closed
>Resolution: Invalid
Priority: 5
Private: No
Submitted By: Nobody/Anonymous (nobody)
Assigned to: Nobody/Anonymous (nobody)
Summary: redirect.py BadTitle Exception Error
Initial Comment:
Running redirect-Bot with -moves option, a BadTitle error raised:
Getting page [[Talk:User:Hawkeye7/Sandbox]]
Traceback (most recent call last):
main()
bot.run()
self.fix_double_redirects()
File redirect.py in fix_double_redirects
for redir_name in self.generator.retrieve_double_redirects():
File redirect.py in retrieve_double_redirects
for redir_page in self.get_moved_pages_redirects():
File redirect.py in get_moved_pages_redirects
if not moved_page.isRedirectPage()
File wikipedia.py in isRedirectPage
self.get()
File wikipedia.py in get
self._contents = self._getEditPage(get_redirect = get_redirect, throttle = throttle, sysop = sysop)
File wikipedia.py in _getEditPage
raise BadeTitle('BadTitle: %s' % self)
wikipedia.BadTitle: BadTitle: [[Talk:User:Hawkeye7/Sandbox]]
<w:de:User:Xqt>
----------------------------------------------------------------------
>Comment By: Russell Blau (russblau)
Date: 2009-02-08 05:34
Message:
1. This is not a bug:
http://en.wikipedia.org/wiki/Talk:User:Hawkeye7/Sandbox returns a Bad Title
page, so the BadTitle exception is correctly raised
2. The Python traceback information in the bug report was edited to
remove line numbers, which makes it much more difficult to locate the
source of any bug. In future, please copy and paste the program's output
without editing (I'm pretty sure that "raise BadeTitle" does not appear
anywhere in wikipedia.py).
3. Always include the output of "python version.py" in every bug report.
----------------------------------------------------------------------
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=603138&aid=2562477&group_…
Bugs item #2577598, was opened at 2009-02-07 16:49
Message generated for change (Comment added) made by multichill
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=603138&aid=2577598&group_…
Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.
Category: General
Group: None
Status: Open
Resolution: None
Priority: 5
Private: No
Submitted By: Mikko Silvonen (silvonen)
Assigned to: Nobody/Anonymous (nobody)
Summary: AttributeError: 'NoneType' object has no attribute 'query'
Initial Comment:
My autonomous interwiki run on all fiwiki categories crashed today with the following error.
...
======Post-processing [[fi:Luokka:Cradle of Filth]]======
Updating links on page [[es:Categoría:Cradle of Filth]].
No changes needed
Updating links on page [[ja:Category:kureidoru obu fuirusu]].
No changes needed
Updating links on page [[sk:Kategória:Cradle of Filth]].
No changes needed
Updating links on page [[en:Category:Cradle of Filth]].
No changes needed
Updating links on page [[fi:Luokka:Cradle of Filth]].
No changes needed
NOTE: The first unfinished subject is [[fi:Luokka:Cradle of Filthin albumit]]
NOTE: Number of pages queued is 99, trying to add 60 more.
Dump fi (wikipedia) saved
Traceback (most recent call last):
File "interwiki.py", line 1818, in <module>
bot.run()
File "interwiki.py", line 1538, in run
self.queryStep()
File "interwiki.py", line 1512, in queryStep
self.oneQuery()
File "interwiki.py", line 1480, in oneQuery
site = self.selectQuerySite()
File "interwiki.py", line 1454, in selectQuerySite
self.generateMore(globalvar.maxquerysize - mycount)
File "interwiki.py", line 1388, in generateMore
page = self.pageGenerator.next()
File "c:\svn\pywikipedia\pagegenerators.py", line 670, in NamespaceFilterPageGenerator
for page in generator:
File "c:\svn\pywikipedia\pagegenerators.py", line 688, in DuplicateFilterPageGenerator
for page in generator:
File "c:\svn\pywikipedia\pagegenerators.py", line 239, in AllpagesPageGenerator
for page in site.allpages(start = start, namespace = namespace, includeredirects = includeredirects):
File "c:\svn\pywikipedia\wikipedia.py", line 5424, in allpages
for p in soup.api.query.allpages:
AttributeError: 'NoneType' object has no attribute 'query'
>python version.py
Pywikipedia [http] trunk/pywikipedia (r6334, Feb 06 2009, 16:42:40)
Python 2.5.1 (r251:54863, May 1 2007, 17:47:05) [MSC v.1310 32 bit (Intel)]
----------------------------------------------------------------------
>Comment By: Multichill (multichill)
Date: 2009-02-07 17:37
Message:
I had the same error yesterday. There seems to be something wrong with the
allpages generator.
The api was changed recently, maybe that has something to do with it.
soup = BeautifulSoup(text, convertEntities=BeautifulSoup.HTML_ENTITIES)
(line 5421 in wikipedia.py) should return an object.
Looks like soup exists, but api doesn't exist. That's strange. When i look
at http://commons.wikimedia.org/w/api.php api is the root element.
We should probably build in some checks to see if we got everything
instead of assuming we get it right straight away.
----------------------------------------------------------------------
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=603138&aid=2577598&group_…
Bugs item #2577598, was opened at 2009-02-07 17:49
Message generated for change (Tracker Item Submitted) made by Item Submitter
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=603138&aid=2577598&group_…
Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.
Category: General
Group: None
Status: Open
Resolution: None
Priority: 5
Private: No
Submitted By: Mikko Silvonen (silvonen)
Assigned to: Nobody/Anonymous (nobody)
Summary: AttributeError: 'NoneType' object has no attribute 'query'
Initial Comment:
My autonomous interwiki run on all fiwiki categories crashed today with the following error.
...
======Post-processing [[fi:Luokka:Cradle of Filth]]======
Updating links on page [[es:Categoría:Cradle of Filth]].
No changes needed
Updating links on page [[ja:Category:kureidoru obu fuirusu]].
No changes needed
Updating links on page [[sk:Kategória:Cradle of Filth]].
No changes needed
Updating links on page [[en:Category:Cradle of Filth]].
No changes needed
Updating links on page [[fi:Luokka:Cradle of Filth]].
No changes needed
NOTE: The first unfinished subject is [[fi:Luokka:Cradle of Filthin albumit]]
NOTE: Number of pages queued is 99, trying to add 60 more.
Dump fi (wikipedia) saved
Traceback (most recent call last):
File "interwiki.py", line 1818, in <module>
bot.run()
File "interwiki.py", line 1538, in run
self.queryStep()
File "interwiki.py", line 1512, in queryStep
self.oneQuery()
File "interwiki.py", line 1480, in oneQuery
site = self.selectQuerySite()
File "interwiki.py", line 1454, in selectQuerySite
self.generateMore(globalvar.maxquerysize - mycount)
File "interwiki.py", line 1388, in generateMore
page = self.pageGenerator.next()
File "c:\svn\pywikipedia\pagegenerators.py", line 670, in NamespaceFilterPageGenerator
for page in generator:
File "c:\svn\pywikipedia\pagegenerators.py", line 688, in DuplicateFilterPageGenerator
for page in generator:
File "c:\svn\pywikipedia\pagegenerators.py", line 239, in AllpagesPageGenerator
for page in site.allpages(start = start, namespace = namespace, includeredirects = includeredirects):
File "c:\svn\pywikipedia\wikipedia.py", line 5424, in allpages
for p in soup.api.query.allpages:
AttributeError: 'NoneType' object has no attribute 'query'
>python version.py
Pywikipedia [http] trunk/pywikipedia (r6334, Feb 06 2009, 16:42:40)
Python 2.5.1 (r251:54863, May 1 2007, 17:47:05) [MSC v.1310 32 bit (Intel)]
----------------------------------------------------------------------
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=603138&aid=2577598&group_…
Revision: 6334
Author: multichill
Date: 2009-02-06 16:42:40 +0000 (Fri, 06 Feb 2009)
Log Message:
-----------
Bugfix and added new option:
-onlyuncat Only work on uncategorized images. Will prevent the bot from working on an image multiple times.
Modified Paths:
--------------
trunk/pywikipedia/imagerecat.py
Modified: trunk/pywikipedia/imagerecat.py
===================================================================
--- trunk/pywikipedia/imagerecat.py 2009-02-06 16:23:15 UTC (rev 6333)
+++ trunk/pywikipedia/imagerecat.py 2009-02-06 16:42:40 UTC (rev 6334)
@@ -9,6 +9,8 @@
-onlyfilter Don't use Commonsense to get categories, just filter the current categories
+-onlyuncat Only work on uncategorized images. Will prevent the bot from working on an image multiple times.
+
-hint Give Commonsense a hint.
For example -hint:li.wikipedia.org
@@ -56,30 +58,34 @@
countries.append(country.titleWithoutNamespace())
return
-def categorizeImages(generator, onlyfilter):
+def categorizeImages(generator, onlyFilter, onlyUncat):
'''
Loop over all images in generator and try to categorize them. Get category suggestions from CommonSense.
'''
for page in generator:
if page.exists() and (page.namespace() == 6) and (not page.isRedirectPage()):
imagepage = wikipedia.ImagePage(page.site(), page.title())
- #imagepage.get()
- wikipedia.output(u'Working on ' + imagepage.title());
- currentCats = getCurrentCats(imagepage)
- if(onlyfilter):
- commonshelperCats = []
- usage = []
- galleries = []
- else:
- (commonshelperCats, usage, galleries) = getCommonshelperCats(imagepage)
- newcats = applyAllFilters(commonshelperCats+currentCats)
+ wikipedia.output(u'Working on ' + imagepage.title())
- if (len(newcats) > 0 and not(set(currentCats)==set(newcats))):
- for cat in newcats:
- wikipedia.output(u' Found new cat: ' + cat);
- saveImagePage(imagepage, newcats, usage, galleries, onlyfilter)
+ if(onlyUncat and not(u'Uncategorized' in imagepage.templates())):
+ wikipedia.output(u'No Uncategorized template found')
+ else:
+ currentCats = getCurrentCats(imagepage)
+ if(onlyFilter):
+ commonshelperCats = []
+ usage = []
+ galleries = []
+ else:
+ (commonshelperCats, usage, galleries) = getCommonshelperCats(imagepage)
+ newcats = applyAllFilters(commonshelperCats+currentCats)
+ if (len(newcats) > 0 and not(set(currentCats)==set(newcats))):
+ for cat in newcats:
+ wikipedia.output(u' Found new cat: ' + cat);
+ saveImagePage(imagepage, newcats, usage, galleries, onlyFilter)
+
+
def getCurrentCats(imagepage):
'''
Get the categories currently on the image
@@ -169,7 +175,7 @@
def applyAllFilters(categories):
'''
Apply all filters on categories.
- ''''
+ '''
result = []
result = filterBlacklist(categories)
result = filterDisambiguation(result)
@@ -275,19 +281,19 @@
return result
-def saveImagePage(imagepage, newcats, usage, galleries, onlyfilter):
+def saveImagePage(imagepage, newcats, usage, galleries, onlyFilter):
'''
Remove the old categories and add the new categories to the image.
'''
newtext = wikipedia.removeCategoryLinks(imagepage.get(), imagepage.site())
- if not(onlyfilter):
+ if not(onlyFilter):
newtext = removeTemplates(newtext)
newtext = newtext + getCheckCategoriesTemplate(usage, galleries, len(newcats))
for category in newcats:
newtext = newtext + u'[[Category:' + category + u']]\n'
- if(onlyfilter):
+ if(onlyFilter):
comment = u'Filtering categories'
else:
comment = u'Image is categorized by a bot using data from [[Commons:Tools#CommonSense|CommonSense]]'
@@ -336,7 +342,8 @@
Main loop. Get a generator and options. Work on all images in the generator.
'''
generator = None
- onlyfilter = False
+ onlyFilter = False
+ onlyUncat = False
genFactory = pagegenerators.GeneratorFactory()
global search_wikis
@@ -346,7 +353,9 @@
wikipedia.setSite(site)
for arg in wikipedia.handleArgs():
if arg == '-onlyfilter':
- onlyfilter = True
+ onlyFilter = True
+ elif arg == '-onlyuncat':
+ onlyUncat = True
elif arg.startswith('-hint:'):
hint_wiki = arg [len('-hint:'):]
elif arg.startswith('-onlyhint'):
@@ -359,7 +368,7 @@
generator = pagegenerators.CategorizedPageGenerator(catlib.Category(site, u'Category:Media needing categories'), recurse=True)
initLists()
- categorizeImages(generator, onlyfilter)
+ categorizeImages(generator, onlyFilter, onlyUncat)
wikipedia.output(u'All done')
Revision: 6333
Author: multichill
Date: 2009-02-06 16:23:15 +0000 (Fri, 06 Feb 2009)
Log Message:
-----------
Added -hint and -onlyhint options.
Updated some documentation.
Modified Paths:
--------------
trunk/pywikipedia/imagerecat.py
Modified: trunk/pywikipedia/imagerecat.py
===================================================================
--- trunk/pywikipedia/imagerecat.py 2009-02-06 14:31:29 UTC (rev 6332)
+++ trunk/pywikipedia/imagerecat.py 2009-02-06 16:23:15 UTC (rev 6333)
@@ -3,14 +3,27 @@
Program to (re)categorize images at commons.
The program uses commonshelper for category suggestions.
-It takes the suggestions and the current categories. Put the categories through some filters and add the result
+It takes the suggestions and the current categories. Put the categories through some filters and adds the result.
+The following command line parameters are supported:
+
+-onlyfilter Don't use Commonsense to get categories, just filter the current categories
+
+-hint Give Commonsense a hint.
+ For example -hint:li.wikipedia.org
+
+-onlyhint Give Commonsense a hint. And only work on this hint.
+ Syntax is the same as -hint. Some special hints are possible:
+ _20 : Work on the top 20 wikipedia's
+ _80 : Work on the top 80 wikipedia's
+ wps : Work on all wikipedia's
+
"""
__version__ = '$Id$'
#
# (C) Multichill 2008
-# (tkinter part loosely based on imagecopy.py)
-# Distributed under the terms of the MIT license.
+#
+# Distributed under the terms of the MIT license.
#
#
import os, sys, re, codecs
@@ -24,6 +37,9 @@
category_blacklist = []
countries = []
+search_wikis=u'_20'
+hint_wiki=u''
+
def initLists():
'''
Get the list of countries & the blacklist from Commons.
@@ -81,8 +97,11 @@
commonshelperCats = []
usage = []
galleries = []
+
+ global search_wikis
+ global hint_wiki
- parameters = urllib.urlencode({'i' : imagepage.titleWithoutNamespace().encode('utf-8'), 'r' : 'on', 'go-clean' : 'Find+Categories', 'cl' : 'li'})
+ parameters = urllib.urlencode({'i' : imagepage.titleWithoutNamespace().encode('utf-8'), 'r' : 'on', 'go-clean' : 'Find+Categories', 'p' : search_wikis, 'cl' : hint_wiki})
commonsenseRe = re.compile('^#COMMONSENSE(.*)#USAGE(\s)+\((?P<usagenum>(\d)+)\)\s(?P<usage>(.*))\s#KEYWORDS(\s)+\((?P<keywords>(\d)+)\)(.*)#CATEGORIES(\s)+\((?P<catnum>(\d)+)\)\s(?P<cats>(.*))\s#GALLERIES(\s)+\((?P<galnum>(\d)+)\)\s(?P<gals>(.*))\s(.*)#EOF$', re.MULTILINE + re.DOTALL)
gotInfo = False;
@@ -121,6 +140,9 @@
return (commonshelperCats, usage, galleries)
def getUsage(use):
+ '''
+ Parse the Commonsense output to get the usage
+ '''
result = []
lang = ''
project = ''
@@ -145,6 +167,9 @@
def applyAllFilters(categories):
+ '''
+ Apply all filters on categories.
+ ''''
result = []
result = filterBlacklist(categories)
result = filterDisambiguation(result)
@@ -283,6 +308,9 @@
return result
def getCheckCategoriesTemplate(usage, galleries, ncats):
+ '''
+ Build the check categories template with all parameters
+ '''
result = u'{{Check categories|year={{subst:CURRENTYEAR}}|month={{subst:CURRENTMONTHNAME}}|day={{subst:CURRENTDAY}}\n'
usageCounter = 1
@@ -305,17 +333,24 @@
def main(args):
'''
- Main loop. Get a generator. Set up the 3 threads and the 2 queue's and fire everything up.
+ Main loop. Get a generator and options. Work on all images in the generator.
'''
generator = None
onlyfilter = False
genFactory = pagegenerators.GeneratorFactory()
+ global search_wikis
+ global hint_wiki
+
site = wikipedia.getSite(u'commons', u'commons')
wikipedia.setSite(site)
for arg in wikipedia.handleArgs():
if arg == '-onlyfilter':
onlyfilter = True
+ elif arg.startswith('-hint:'):
+ hint_wiki = arg [len('-hint:'):]
+ elif arg.startswith('-onlyhint'):
+ search_wikis = arg [len('-onlyhint:'):]
else:
genFactory.handleArg(arg)