Bugs item #2011802, was opened at 2008-07-06 23:18
Message generated for change (Tracker Item Submitted) made by Item Submitter
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=603138&aid=2011802&group_…
Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.
Category: None
Group: None
Status: Open
Resolution: None
Priority: 5
Private: No
Submitted By: Alex S.H. Lin (lin4h)
Assigned to: Nobody/Anonymous (nobody)
Summary: cannot find mw messages when deleting page in zhwiki
Initial Comment:
Pywikipedia [http] trunk/pywikipedia (r5680, Jul 06 2008, 10:31:36)
Python 2.5.2 (r252:60911, Feb 21 2008, 13:11:45) [MSC v.1310 32 bit (Intel)]
When I use speedy_delete.py to delete pages and images, the current code cannot find the MediaWiki messages "actioncomplete" or "cannotdelete"; it returns "Deletion of [[xxx]] failed for an unknown reason." and prints all the HTML code (the pages look as though they were deleted).
I checked the current HTML in zhwiki (I can only use the delete action in zhwiki) and these messages do not exist.
----------------------------------------------------------------------
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=603138&aid=2011802&group_…
Revision: 5681
Author: nicdumz
Date: 2008-07-06 13:20:50 +0000 (Sun, 06 Jul 2008)
Log Message:
-----------
* Overriding urllib.FancyURLopener.http_error_default to catch 403 and 404 errors :
Without this, trying to access a nonexistent path had strange behavior. In particular,
404 pages using a different encoding than the site's encoding were raising "code2encodings has wrong charset"...
* removing useless and misleading Site.charset
Modified Paths:
--------------
trunk/pywikipedia/wikipedia.py
Modified: trunk/pywikipedia/wikipedia.py
===================================================================
--- trunk/pywikipedia/wikipedia.py 2008-07-06 10:31:36 UTC (rev 5680)
+++ trunk/pywikipedia/wikipedia.py 2008-07-06 13:20:50 UTC (rev 5681)
@@ -5574,12 +5574,11 @@
def checkCharset(self, charset):
"""Warn if charset returned by wiki doesn't match family file."""
- if not hasattr(self,'charset'):
- self.charset = charset
- assert self.charset.lower() == charset.lower(), \
+ fromFamily = self.encoding()
+ assert fromFamily.lower() == charset.lower(), \
"charset for %s changed from %s to %s" \
- % (repr(self), self.charset, charset)
- if self.encoding().lower() != charset.lower():
+ % (repr(self), fromFamily, charset)
+ if fromFamily.lower() != charset.lower():
raise ValueError(
"code2encodings has wrong charset for %s. It should be %s, but is %s"
% (repr(self), charset, self.encoding()))
@@ -6414,6 +6413,14 @@
class MyURLopener(urllib.FancyURLopener):
version="PythonWikipediaBot/1.0"
+
+ def http_error_default(self, url, fp, errcode, errmsg, headers):
+ if errcode == 401 or errcode == 404:
+ raise PageNotFound(u'Page %s could not be retrieved. Check your family file ?' % url)
+ else:
+ return urllib.FancyURLopener(self, url, fp, errcode, errmsg, headers)
+
+
# Special opener in case we are using a site with authentication
if config.authenticate:
Revision: 5676
Author: filnik
Date: 2008-07-05 20:26:02 +0000 (Sat, 05 Jul 2008)
Log Message:
-----------
Beta version, but working pretty good, of the smartdetection
Modified Paths:
--------------
trunk/pywikipedia/checkimages.py
Modified: trunk/pywikipedia/checkimages.py
===================================================================
--- trunk/pywikipedia/checkimages.py 2008-07-05 18:21:03 UTC (rev 5675)
+++ trunk/pywikipedia/checkimages.py 2008-07-05 20:26:02 UTC (rev 5676)
@@ -22,6 +22,8 @@
-duplicatesreport - Report the duplicates in a log *AND* put the template in the images.
+ -smartdetection - Check in a category if the license found exist in realit or not.
+
-sendemail - Send an email after tagging.
-break - To break the bot after the first check (default: recursive)
@@ -308,7 +310,7 @@
u'dupe', u'duplicate', u'uncat', u'uncategorized', u'watermark', u'nocat', u'imageupload'],
'de':[u'information'],
'en':[u'information'],
- 'it':[u'edp', u'informazioni[ _]file', u'information', u'trademark'],
+ 'it':[u'edp', u'informazioni[ _]file', u'information', u'trademark', u'permissionotrs'],
'ja':[u'Information'],
'hu':[u'információ', u'enwiki', u'azonnali'],
'ta':[u'information'],
@@ -356,6 +358,11 @@
'it':r'\{\{(?:[Tt]emplate:|)[Cc]ancella[ _]subito[|}]',
}
+category_with_licenses = {
+ 'commons':'Category:License tags',
+ 'it':'Categoria:Template Licenze copyright',
+ }
+
## Put None if you don't use this option or simply add nothing if en
## is still None.
# Page where is stored the message to send as email to the users
@@ -447,14 +454,14 @@
""" Constructor, define some global variable """
self.site = site
self.logFulNumber = logFulNumber
- self.settings = wikipedia.translate(site, page_with_settings)
- self.rep_page = wikipedia.translate(site, report_page)
- self.rep_text = wikipedia.translate(site, report_text)
- self.com = wikipedia.translate(site, comm10)
+ self.settings = wikipedia.translate(self.site, page_with_settings)
+ self.rep_page = wikipedia.translate(self.site, report_page)
+ self.rep_text = wikipedia.translate(self.site, report_text)
+ self.com = wikipedia.translate(self.site, comm10)
# Commento = Summary in italian
self.commento = wikipedia.translate(self.site, comm)
# Adding the bot's nickname at the notification text if needed.
- botolist = wikipedia.translate(wikipedia.getSite(), bot_list)
+ botolist = wikipedia.translate(self.site, bot_list)
project = wikipedia.getSite().family.name
bot = config.usernames[project]
botnick = bot[self.site.lang]
@@ -807,13 +814,13 @@
return False # The image is a duplicate, it will be deleted.
return True # Ok - No problem. Let's continue the checking phase
- def report_image(self, image, rep_page = None, com = None, rep_text = None, addings = True, regex = None):
+ def report_image(self, image_to_report, rep_page = None, com = None, rep_text = None, addings = True, regex = None):
""" Function to report the images in the report page when needed. """
if rep_page == None: rep_page = self.rep_page
if com == None: com = self.com
if rep_text == None: rep_text = self.rep_text
another_page = wikipedia.Page(self.site, rep_page)
- if regex == None: regex = image
+ if regex == None: regex = image_to_report
if another_page.exists():
text_get = another_page.get()
else:
@@ -821,25 +828,24 @@
if len(text_get) >= self.logFulNumber:
raise LogIsFull("The log page (%s) is full! Please delete the old images reported." % another_page.title())
pos = 0
- # The talk page includes "_" between the two names, in this way i replace them to " "
+ # The talk page includes "_" between the two names, in this way i replace them to " "
n = re.compile(regex, re.UNICODE|re.M)
y = n.search(text_get, pos)
if y == None:
# Adding the log
if addings:
- rep_text = rep_text % image # Adding the name of the image in the report if not done already
+ rep_text = rep_text % image_to_report # Adding the name of the image in the report if not done already
another_page.put(text_get + rep_text, comment = com, minorEdit = False)
wikipedia.output(u"...Reported...")
reported = True
else:
pos = y.end()
- wikipedia.output(u"%s is already in the report page." % image)
+ wikipedia.output(u"%s is already in the report page." % image_to_report)
reported = False
return reported
def takesettings(self):
""" Function to take the settings from the wiki. """
- pos = 0
if self.settings == None: lista = None
else:
x = wikipedia.Page(self.site, self.settings)
@@ -849,32 +855,41 @@
rxp = r"<------- ------->\n\*[Nn]ame ?= ?['\"](.*?)['\"]\n\*([Ff]ind|[Ff]indonly)=(.*?)\n\*[Ii]magechanges=(.*?)\n\*[Ss]ummary=['\"](.*?)['\"]\n\*[Hh]ead=['\"](.*?)['\"]\n\*[Tt]ext ?= ?['\"](.*?)['\"]\n\*[Mm]ex ?= ?['\"]?(.*?)['\"]?$"
r = re.compile(rxp, re.UNICODE|re.M)
number = 1
- while 1:
- m = r.search(testo, pos)
- if m == None:
- if lista == list():
- wikipedia.output(u"You've set wrongly your settings, please take a look to the relative page. (run without them)")
- lista = None
- else:
- break
- else:
- pos = m.end()
- name = str(m.group(1))
- find_tipe = str(m.group(2))
- find = str(m.group(3))
- imagechanges = str(m.group(4))
- summary = str(m.group(5))
- head = str(m.group(6))
- text = str(m.group(7))
- mexcatched = str(m.group(8))
- tupla = [number, name, find_tipe, find, imagechanges, summary, head, text, mexcatched]
- lista += [tupla]
- number += 1
+ for m in r.finditer(testo):
+ name = str(m.group(1))
+ find_tipe = str(m.group(2))
+ find = str(m.group(3))
+ imagechanges = str(m.group(4))
+ summary = str(m.group(5))
+ head = str(m.group(6))
+ text = str(m.group(7))
+ mexcatched = str(m.group(8))
+ tupla = [number, name, find_tipe, find, imagechanges, summary, head, text, mexcatched]
+ lista += [tupla]
+ number += 1
+ if lista == list():
+ wikipedia.output(u"You've set wrongly your settings, please take a look to the relative page. (run without them)")
+ lista = None
except wikipedia.NoPage:
wikipedia.output(u"The settings' page doesn't exist!")
lista = None
return lista
-
+
+ def load_licenses(self):
+ """ Load the list of the licenses """
+ catName = wikipedia.translate(self.site, category_with_licenses)
+ cat = catlib.Category(wikipedia.getSite(), catName)
+ categories = [page.title() for page in pagegenerators.SubCategoriesPageGenerator(cat)]
+ categories.append(catName)
+ list_licenses = list()
+ wikipedia.output(u'\n\t...Loading the names of the licenses allowed...\n')
+ for catName in categories:
+ cat = catlib.Category(wikipedia.getSite(), catName)
+ gen = pagegenerators.CategorizedPageGenerator(cat)
+ pages = [page for page in gen]
+ list_licenses.extend(pages)
+ return list_licenses
+
def load(self, raw):
""" Load a list of object from a string using regex. """
list_loaded = list()
@@ -885,11 +900,6 @@
regl = r"(?:\"|\')(.*?)(?:\"|\')(?:, |\])"
pl = re.compile(regl, re.UNICODE)
for xl in pl.finditer(raw):
- if xl == None:
- if len(list_loaded) >= 1:
- return list_loaded
- break
- pos = xl.end()
word = xl.group(1)
if word not in list_loaded:
list_loaded.append(word)
@@ -911,8 +921,9 @@
skip_list = list() # Inizialize the skip list used below
duplicatesActive = False # Use the duplicate option
duplicatesReport = False # Use the duplicate-report option
- sendemailActive = False # Use the send-email option
-
+ sendemailActive = False # Use the send-email
+ smartdetection = False # Enable the smart detection
+
# Here below there are the parameters.
for arg in wikipedia.handleArgs():
if arg.startswith('-limit'):
@@ -935,6 +946,8 @@
duplicatesReport = True
elif arg == '-sendemail':
sendemailActive = True
+ elif arg == '-smartdetection':
+ smartdetection = True
elif arg.startswith('-skip'):
if len(arg) == 5:
skip = True
@@ -999,7 +1012,7 @@
projectUntagged = str(wikipedia.input(u'In which project should I work?'))
elif len(arg) > 9:
projectUntagged = str(arg[10:])
-
+
# Understand if the generator it's the default or not.
try:
generator
@@ -1090,6 +1103,8 @@
wikipedia.output(u'Problems with loading the settigs, run without them.')
tupla_written = None
some_problem = False
+ # Load the list of licenses allowed for our project
+ list_licenses = mainClass.load_licenses()
# Ensure that if the list given is empty it will be converted to "None"
# (but it should be already done in the takesettings() function)
if tupla_written == []: tupla_written = None
@@ -1200,7 +1215,8 @@
white_template_found += 1
if l != '' and l != ' ': # Check that l is not nothing or a space
# Deleting! (replace the template with nothing)
- g = re.sub(r'\{\{(?:template:|)%s' % l.lower(), r'', g.lower())
+ regex_white_template = re.compile(r'\{\{(?:template:|)%s' % l, re.IGNORECASE)
+ g = regex_white_template.sub(r'', g)
hiddenTemplateFound = True
if white_template_found == 1:
wikipedia.output(u'A white template found, skipping the template...')
@@ -1289,9 +1305,36 @@
some_problem = False
continue
elif parentesi == True:
- printWithTimeZone(u"%s seems ok," % imageName)
+ seems_ok = False
+ license_found = None
+ if smartdetection:
+ regex_find_licenses = re.compile(r'\{\{(?:[Tt]emplate:|)(.*?)(?:[|\n].*?|)\}\}', re.DOTALL)
+ licenses_found = regex_find_licenses.findall(g)
+ if licenses_found != []:
+ for license_selected in licenses_found:
+ #print template.exists()
+ template = wikipedia.Page(site, 'Template:%s' % license_selected)
+ if template.isRedirectPage():
+ template = template.getRedirectTarget()
+ license_found = license_selected
+ if template in list_licenses:
+ seems_ok = True
+ break
+ if not seems_ok:
+ rep_text_license_faked = "\n*[[:Image:%s]] seems to have a ''fake license'', license detected: %s." % (imageName, license_found)
+ regexFakedLicense = r"\* ?\[\[:Image:%s\]\] seems to have a ''fake license'', license detected: %s." % (imageName, license_found)
+ printWithTimeZone(u"%s seems to have a fake license: %s, reporting..." % (imageName, license_found))
+ mainClass.report_image(imageName, rep_text = rep_text_license_faked,
+ addings = False, regex = regexFakedLicense)
+ else:
+ seems_ok = True
+ if seems_ok:
+ if license_found != None:
+ printWithTimeZone(u"%s seems ok, license found: %s..." % (imageName, license_found))
+ else:
+ printWithTimeZone(u"%s seems ok..." % imageName)
# It works also without this... but i want only to be sure ^^
- parentesi = False
+ parentesi = False
continue
elif delete == True:
wikipedia.output(u"%s is not a file!" % imageName)
Bugs item #2006208, was opened at 2008-06-29 23:58
Message generated for change (Comment added) made by melancholie
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=603138&aid=2006208&group_…
Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.
Category: None
Group: None
Status: Open
Resolution: None
Priority: 9
Private: No
Submitted By: Melancholie (melancholie)
Assigned to: Nobody/Anonymous (nobody)
Summary: Undo decodeEsperantoX removal (r5563)
Initial Comment:
I do not know why siebrand had problems with decodeEsperantoX(),
see http://eo.wikipedia.org/w/index.php?title=Sunfloro&diff=1775705&oldid=15592… - but now I do have that problem (not with interwiki.py, but with featured.py and replace.py)!
See http://eo.wikipedia.org/w/index.php?title=Vitamino_C&diff=prev&oldid=1819462
Characters are replaced to gxx, sxx etc., making templates unusable etc...
As decodeEsperantoX() had been used for a very long time, please undo that change:
http://svn.wikimedia.org/viewvc/pywikipedia/trunk/pywikipedia/wikipedia.py?…
----------------------------------------------------------------------
>Comment By: Melancholie (melancholie)
Date: 2008-07-05 20:37
Message:
Logged In: YES
user_id=2089773
Originator: YES
Maybe this was needed before 2004, when Wikimedia wikis didn't have UTF-8 as
their encoding!?
Do you know whether the example that was mentioned, "Bordeauxx instead of
Bordeaux", currently works?
If yes, I would say we could remove all the remaining
de/encodeEsperantoX() cruft, too. (description, *def* (function itself)).
----------------------------------------------------------------------
Comment By: NicDumZ Nicolas Dumazet (nicdumz)
Date: 2008-07-04 20:31
Message:
Logged In: YES
user_id=1963242
Originator: NO
encodeEsperantoX calls were removed in r5670.
I'm leaving this open, since I bet there was a reason for that X writing
convention. Is removing it completely problematic?
----------------------------------------------------------------------
Comment By: Melancholie (melancholie)
Date: 2008-07-04 15:05
Message:
Logged In: YES
user_id=2089773
Originator: YES
The problem seems to be that decodeEsperantoX() has been removed, but
encodeEsperantoX() remains.
So the encodeEsperantoX() parts currently produce 'xx'! Either revert the
SVN change, or also remove encodeEsperantoX().
----------------------------------------------------------------------
Comment By: Melancholie (melancholie)
Date: 2008-06-30 00:19
Message:
Logged In: YES
user_id=2089773
Originator: YES
See also:
*
http://eo.wikipedia.org/wiki/Vikipedio:Diskutejo#.22Laboro.22_de_robotistoj
*
http://eo.wikipedia.org/w/index.php?title=Historio_de_Unui%C4%9Dinta_Re%C4%…
----------------------------------------------------------------------
Comment By: Melancholie (melancholie)
Date: 2008-06-30 00:00
Message:
Logged In: YES
user_id=2089773
Originator: YES
http://eo.wikipedia.org/wiki/Speciala:Contributions/Melancholie (r5558
works; r5639 is broken)!
----------------------------------------------------------------------
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=603138&aid=2006208&group_…
Revision: 5675
Author: siebrand
Date: 2008-07-05 18:21:03 +0000 (Sat, 05 Jul 2008)
Log Message:
-----------
svn:eol-style:native
Modified Paths:
--------------
trunk/pywikipedia/archive/mediawiki_messages.py
trunk/pywikipedia/category_redirect.py
trunk/pywikipedia/commonsdelinker/plugins.txt
trunk/pywikipedia/families/README-family.txt
trunk/pywikipedia/protect.py
Property Changed:
----------------
trunk/pywikipedia/archive/mediawiki_messages.py
trunk/pywikipedia/category_redirect.py
trunk/pywikipedia/commonsdelinker/plugins.txt
trunk/pywikipedia/families/README-family.txt
trunk/pywikipedia/protect.py
Modified: trunk/pywikipedia/archive/mediawiki_messages.py
===================================================================
--- trunk/pywikipedia/archive/mediawiki_messages.py 2008-07-05 12:57:13 UTC (rev 5674)
+++ trunk/pywikipedia/archive/mediawiki_messages.py 2008-07-05 18:21:03 UTC (rev 5675)
@@ -1,218 +1,218 @@
-# -*- coding: utf-8 -*-
-"""
-Allows access to the MediaWiki messages, that's the label texts of the MediaWiki
-software in the current language. These can be used in other bots.
-
-The function refresh_messages() downloads all the current messages and saves
-them to disk. It is run automatically when a bot first tries to access one of
-the messages. It can be updated manually by running this script, e.g. when
-somebody changed the current message at the wiki. The texts will also be
-reloaded automatically once a month.
-
-Syntax: python mediawiki_messages [-all]
-
-Command line options:
- -refresh - Reloads messages for the home wiki or for the one defined via
- the -lang and -family parameters.
-
- -all - Reloads messages for all wikis where messages are already present
-
- If another parameter is given, it will be interpreted as a MediaWiki key.
- The script will then output the respective value, without refreshing..
-
-"""
-
-# (C) Daniel Herding, 2004
-#
-# Distributed under the terms of the MIT license.
-
-##THIS MODULE IS DEPRECATED AND HAS BEEN REPLACED BY NEW FUNCTIONALITY IN
-##WIKIPEDIA.PY. It is being retained solely for compatibility in case any
-##custom-written bots rely upon it. Bot authors should replace any uses
-##of this module as follows:
-##
-## OLD: mediawiki_messages.get(key, site)
-## NEW: site.mediawiki_message(key)
-##
-## OLD: mediawiki_messages.has(key, site)
-## NEW: site.has_mediawiki_message(key)
-##
-## OLD: mediawiki_messages.makepath(path)
-## NEW: wikipedia.makepath(path)
-##
-##########################################################################
-
-import warnings
-warnings.warn(
-"""The mediawiki_messages module is deprecated and no longer
-maintained; see the source code for new methods to replace
-calls to this module.""",
- DeprecationWarning, stacklevel=2)
-
-
-import wikipedia
-import re, sys, pickle
-import os.path
-import time
-import codecs
-import urllib
-from BeautifulSoup import *
-
-__version__='$Id: mediawiki_messages.py 3731 2007-06-20 14:42:55Z russblau $'
-
-loaded = {}
-
-def get(key, site = None, allowreload = True):
- site = site or wikipedia.getSite()
- if loaded.has_key(site):
- # Use cached copy if it exists.
- dictionary = loaded[site]
- else:
- fn = 'mediawiki-messages/mediawiki-messages-%s-%s.dat' % (site.family.name, site.lang)
- try:
- # find out how old our saved dump is (in seconds)
- file_age = time.time() - os.path.getmtime(fn)
- # if it's older than 1 month, reload it
- if file_age > 30 * 24 * 60 * 60:
- print 'Current MediaWiki message dump is one month old, reloading'
- refresh_messages(site)
- except OSError:
- # no saved dumped exists yet
- refresh_messages(site)
- f = open(fn, 'r')
- dictionary = pickle.load(f)
- f.close()
- loaded[site] = dictionary
- key = key[0].lower() + key[1:]
- if dictionary.has_key(key):
- return dictionary[key]
- elif allowreload:
- refresh_messages(site = site)
- return get(key, site = site, allowreload = False)
- else:
- raise KeyError('MediaWiki Key %s not found' % key)
-
-def has(key, site = None, allowreload = True):
- try:
- get(key, site, allowreload)
- return True
- except KeyError:
- return False
-
-def makepath(path):
- """ creates missing directories for the given path and
- returns a normalized absolute version of the path.
-
- - if the given path already exists in the filesystem
- the filesystem is not modified.
-
- - otherwise makepath creates directories along the given path
- using the dirname() of the path. You may append
- a '/' to the path if you want it to be a directory path.
-
- from holger(a)trillke.net 2002/03/18
- """
- from os import makedirs
- from os.path import normpath,dirname,exists,abspath
-
- dpath = normpath(dirname(path))
- if not exists(dpath): makedirs(dpath)
- return normpath(abspath(path))
-
-def refresh_messages(site = None):
- site = site or wikipedia.getSite()
- # get 'all messages' special page's path
- path = site.allmessages_address()
- print 'Retrieving MediaWiki messages for %s' % repr(site)
- wikipedia.put_throttle() # It actually is a get, but a heavy one.
- allmessages = site.getUrl(path)
-
- print 'Parsing MediaWiki messages'
- soup = BeautifulSoup(allmessages,
- convertEntities=BeautifulSoup.HTML_ENTITIES)
- # The MediaWiki namespace in URL-encoded format, as it can contain
- # non-ASCII characters and spaces.
- quotedMwNs = urllib.quote(site.namespace(8).replace(' ', '_').encode(site.encoding()))
- mw_url = site.path() + "?title=" + quotedMwNs + ":"
- altmw_url = site.path() + "/" + quotedMwNs + ":"
- nicemw_url = site.nice_get_address(quotedMwNs + ":")
- shortmw_url = "/" + quotedMwNs + ":"
- ismediawiki = lambda url:url and (url.startswith(mw_url)
- or url.startswith(altmw_url)
- or url.startswith(nicemw_url)
- or url.startswith(shortmw_url))
- # we will save the found key:value pairs here
- dictionary = {}
-
- try:
- for keytag in soup('a', href=ismediawiki):
- # Key strings only contain ASCII characters, so we can save them as
- # strs
- key = str(keytag.find(text=True))
- keyrow = keytag.parent.parent
- if keyrow['class'] == "orig":
- valrow = keyrow.findNextSibling('tr')
- assert valrow['class'] == "new"
- value = unicode(valrow.td.string).strip()
- elif keyrow['class'] == 'def':
- value = unicode(keyrow('td')[1].string).strip()
- else:
- raise AssertionError("Unknown tr class value: %s" % keyrow['class'])
- dictionary[key] = value
- except Exception, e:
- wikipedia.debugDump( 'MediaWiki_Msg', site, u'%s: %s while processing URL: %s' % (repr(e), str(e), unicode(path)), allmessages)
- raise
-
- # Save the dictionary to disk
- # The file is stored in the mediawiki_messages subdir. Create if necessary.
- if dictionary == {}:
- wikipedia.debugDump( 'MediaWiki_Msg', site, u'Error URL: '+unicode(path), allmessages )
- sys.exit()
- else:
- f = open(makepath('mediawiki-messages/mediawiki-messages-%s-%s.dat' % (site.family.name, site.lang)), 'w')
- pickle.dump(dictionary, f)
- f.close()
- print "Loaded %i values from %s" % (len(dictionary.keys()), site)
- #print dictionary['sitestatstext']
-
-def refresh_all_messages():
- import dircache, time
- filenames = dircache.listdir('mediawiki-messages')
- message_filenameR = re.compile('mediawiki-messages-([a-z:]+)-([a-z:]+).dat')
- for filename in filenames:
- match = message_filenameR.match(filename)
- if match:
- family = match.group(1)
- lang = match.group(2)
- site = wikipedia.getSite(code = lang, fam = family)
- refresh_messages(site)
-
-def main():
- refresh_all = False
- refresh = False
- key = None
- for arg in wikipedia.handleArgs():
- if arg == '-all':
- refresh_all = True
- elif arg == '-refresh':
- refresh = True
- else:
- key = arg
- if key:
- wikipedia.output(get(key), toStdout = True)
- elif refresh_all:
- refresh_all_messages()
- elif refresh:
- refresh_messages(wikipedia.getSite())
- else:
- wikipedia.showHelp('mediawiki_messages')
-
-if __name__ == "__main__":
- try:
- main()
- except:
- wikipedia.stopme()
- raise
- else:
- wikipedia.stopme()
-
+# -*- coding: utf-8 -*-
+"""
+Allows access to the MediaWiki messages, that's the label texts of the MediaWiki
+software in the current language. These can be used in other bots.
+
+The function refresh_messages() downloads all the current messages and saves
+them to disk. It is run automatically when a bot first tries to access one of
+the messages. It can be updated manually by running this script, e.g. when
+somebody changed the current message at the wiki. The texts will also be
+reloaded automatically once a month.
+
+Syntax: python mediawiki_messages [-all]
+
+Command line options:
+ -refresh - Reloads messages for the home wiki or for the one defined via
+ the -lang and -family parameters.
+
+ -all - Reloads messages for all wikis where messages are already present
+
+ If another parameter is given, it will be interpreted as a MediaWiki key.
+ The script will then output the respective value, without refreshing..
+
+"""
+
+# (C) Daniel Herding, 2004
+#
+# Distributed under the terms of the MIT license.
+
+##THIS MODULE IS DEPRECATED AND HAS BEEN REPLACED BY NEW FUNCTIONALITY IN
+##WIKIPEDIA.PY. It is being retained solely for compatibility in case any
+##custom-written bots rely upon it. Bot authors should replace any uses
+##of this module as follows:
+##
+## OLD: mediawiki_messages.get(key, site)
+## NEW: site.mediawiki_message(key)
+##
+## OLD: mediawiki_messages.has(key, site)
+## NEW: site.has_mediawiki_message(key)
+##
+## OLD: mediawiki_messages.makepath(path)
+## NEW: wikipedia.makepath(path)
+##
+##########################################################################
+
+import warnings
+warnings.warn(
+"""The mediawiki_messages module is deprecated and no longer
+maintained; see the source code for new methods to replace
+calls to this module.""",
+ DeprecationWarning, stacklevel=2)
+
+
+import wikipedia
+import re, sys, pickle
+import os.path
+import time
+import codecs
+import urllib
+from BeautifulSoup import *
+
+__version__='$Id: mediawiki_messages.py 3731 2007-06-20 14:42:55Z russblau $'
+
+loaded = {}
+
+def get(key, site = None, allowreload = True):
+ site = site or wikipedia.getSite()
+ if loaded.has_key(site):
+ # Use cached copy if it exists.
+ dictionary = loaded[site]
+ else:
+ fn = 'mediawiki-messages/mediawiki-messages-%s-%s.dat' % (site.family.name, site.lang)
+ try:
+ # find out how old our saved dump is (in seconds)
+ file_age = time.time() - os.path.getmtime(fn)
+ # if it's older than 1 month, reload it
+ if file_age > 30 * 24 * 60 * 60:
+ print 'Current MediaWiki message dump is one month old, reloading'
+ refresh_messages(site)
+ except OSError:
+ # no saved dumped exists yet
+ refresh_messages(site)
+ f = open(fn, 'r')
+ dictionary = pickle.load(f)
+ f.close()
+ loaded[site] = dictionary
+ key = key[0].lower() + key[1:]
+ if dictionary.has_key(key):
+ return dictionary[key]
+ elif allowreload:
+ refresh_messages(site = site)
+ return get(key, site = site, allowreload = False)
+ else:
+ raise KeyError('MediaWiki Key %s not found' % key)
+
+def has(key, site = None, allowreload = True):
+ try:
+ get(key, site, allowreload)
+ return True
+ except KeyError:
+ return False
+
+def makepath(path):
+ """ creates missing directories for the given path and
+ returns a normalized absolute version of the path.
+
+ - if the given path already exists in the filesystem
+ the filesystem is not modified.
+
+ - otherwise makepath creates directories along the given path
+ using the dirname() of the path. You may append
+ a '/' to the path if you want it to be a directory path.
+
+ from holger(a)trillke.net 2002/03/18
+ """
+ from os import makedirs
+ from os.path import normpath,dirname,exists,abspath
+
+ dpath = normpath(dirname(path))
+ if not exists(dpath): makedirs(dpath)
+ return normpath(abspath(path))
+
+def refresh_messages(site = None):
+ site = site or wikipedia.getSite()
+ # get 'all messages' special page's path
+ path = site.allmessages_address()
+ print 'Retrieving MediaWiki messages for %s' % repr(site)
+ wikipedia.put_throttle() # It actually is a get, but a heavy one.
+ allmessages = site.getUrl(path)
+
+ print 'Parsing MediaWiki messages'
+ soup = BeautifulSoup(allmessages,
+ convertEntities=BeautifulSoup.HTML_ENTITIES)
+ # The MediaWiki namespace in URL-encoded format, as it can contain
+ # non-ASCII characters and spaces.
+ quotedMwNs = urllib.quote(site.namespace(8).replace(' ', '_').encode(site.encoding()))
+ mw_url = site.path() + "?title=" + quotedMwNs + ":"
+ altmw_url = site.path() + "/" + quotedMwNs + ":"
+ nicemw_url = site.nice_get_address(quotedMwNs + ":")
+ shortmw_url = "/" + quotedMwNs + ":"
+ ismediawiki = lambda url:url and (url.startswith(mw_url)
+ or url.startswith(altmw_url)
+ or url.startswith(nicemw_url)
+ or url.startswith(shortmw_url))
+ # we will save the found key:value pairs here
+ dictionary = {}
+
+ try:
+ for keytag in soup('a', href=ismediawiki):
+ # Key strings only contain ASCII characters, so we can save them as
+ # strs
+ key = str(keytag.find(text=True))
+ keyrow = keytag.parent.parent
+ if keyrow['class'] == "orig":
+ valrow = keyrow.findNextSibling('tr')
+ assert valrow['class'] == "new"
+ value = unicode(valrow.td.string).strip()
+ elif keyrow['class'] == 'def':
+ value = unicode(keyrow('td')[1].string).strip()
+ else:
+ raise AssertionError("Unknown tr class value: %s" % keyrow['class'])
+ dictionary[key] = value
+ except Exception, e:
+ wikipedia.debugDump( 'MediaWiki_Msg', site, u'%s: %s while processing URL: %s' % (repr(e), str(e), unicode(path)), allmessages)
+ raise
+
+ # Save the dictionary to disk
+ # The file is stored in the mediawiki_messages subdir. Create if necessary.
+ if dictionary == {}:
+ wikipedia.debugDump( 'MediaWiki_Msg', site, u'Error URL: '+unicode(path), allmessages )
+ sys.exit()
+ else:
+ f = open(makepath('mediawiki-messages/mediawiki-messages-%s-%s.dat' % (site.family.name, site.lang)), 'w')
+ pickle.dump(dictionary, f)
+ f.close()
+ print "Loaded %i values from %s" % (len(dictionary.keys()), site)
+ #print dictionary['sitestatstext']
+
+def refresh_all_messages():
+ import dircache, time
+ filenames = dircache.listdir('mediawiki-messages')
+ message_filenameR = re.compile('mediawiki-messages-([a-z:]+)-([a-z:]+).dat')
+ for filename in filenames:
+ match = message_filenameR.match(filename)
+ if match:
+ family = match.group(1)
+ lang = match.group(2)
+ site = wikipedia.getSite(code = lang, fam = family)
+ refresh_messages(site)
+
+def main():
+ refresh_all = False
+ refresh = False
+ key = None
+ for arg in wikipedia.handleArgs():
+ if arg == '-all':
+ refresh_all = True
+ elif arg == '-refresh':
+ refresh = True
+ else:
+ key = arg
+ if key:
+ wikipedia.output(get(key), toStdout = True)
+ elif refresh_all:
+ refresh_all_messages()
+ elif refresh:
+ refresh_messages(wikipedia.getSite())
+ else:
+ wikipedia.showHelp('mediawiki_messages')
+
+if __name__ == "__main__":
+ try:
+ main()
+ except:
+ wikipedia.stopme()
+ raise
+ else:
+ wikipedia.stopme()
+
Property changes on: trunk/pywikipedia/archive/mediawiki_messages.py
___________________________________________________________________
Name: svn:eol-style
+ native
Modified: trunk/pywikipedia/category_redirect.py
===================================================================
--- trunk/pywikipedia/category_redirect.py 2008-07-05 12:57:13 UTC (rev 5674)
+++ trunk/pywikipedia/category_redirect.py 2008-07-05 18:21:03 UTC (rev 5675)
@@ -1,71 +1,71 @@
-#!/usr/bin/python
-# -*- coding: utf-8 -*-
-"""
-Script to clean up http://commons.wikimedia.org/wiki/Category:Non-empty_category_redirects
-
-Moves all images, pages and categories in redirect categories to the target category.
-
-"""
-
-#
-# (C) Multichill, 2008
-#
-# Distributed under the terms of the MIT license.
-#
-
-import wikipedia, config, catlib
-from category import *
-
-redirect_templates = [u'Category redirect', u'Categoryredirect', u'See cat', u'Seecat', u'Catredirect', u'Cat redirect', u'CatRed', u'Catredir']
-move_message = u'Moving from [[%s|%s]] to [[%s|%s]] (following [[Template:Category redirect|category redirect]])'
-
-def get_redirect_cat(category=None):
- '''
- Return the target category
- '''
- destination = None
- site = wikipedia.getSite(u'commons', u'commons')
- for template in category.templatesWithParams():
- if ((template[0] in redirect_templates) and (len(template[1]) > 0)):
- #destination = template[1][0];
- destination =catlib.Category(site, template[1][0])
- if not destination.exists():
- return None
- return destination
-
-
-def main():
- '''
- Main loop. Loop over all categories of Category:Non-empty_category_redirects and move all content.
- '''
-
- site = wikipedia.getSite(u'commons', u'commons')
- dirtycat = catlib.Category(site, u'Category:Non-empty category redirects')
- destination = None
- catbot = None
-
- for old_category in dirtycat.subcategories():
- destination = get_redirect_cat(old_category)
- if destination:
- wikipedia.output(destination.title())
- for page in old_category.articles():
- try:
- catlib.change_category(page, old_category, destination, move_message % (old_category.title(), old_category.titleWithoutNamespace(), destination.title(), destination.titleWithoutNamespace()))
- except wikipedia.IsRedirectPage:
- wikipedia.output(page.title() + u' is a redirect!')
- for cat in old_category.subcategories():
- try:
- catlib.change_category(cat, old_category, destination, move_message % (old_category.title(), old_category.titleWithoutNamespace(), destination.title(), destination.titleWithoutNamespace()))
- except wikipedia.IsRedirectPage:
- wikipedia.output(page.title() + u' is a redirect!')
-        #Dummy edit to refresh the page, shouldn't show up in any logs.
- try:
- old_category.put(old_category.get())
- except:
- wikipedia.output(u'Dummy edit at ' + old_category.title() + u' failed')
-
-if __name__ == "__main__":
- try:
- main()
- finally:
- wikipedia.stopme()
+#!/usr/bin/python
+# -*- coding: utf-8 -*-
+"""
+Script to clean up http://commons.wikimedia.org/wiki/Category:Non-empty_category_redirects
+
+Moves all images, pages and categories in redirect categories to the target category.
+
+"""
+
+#
+# (C) Multichill, 2008
+#
+# Distributed under the terms of the MIT license.
+#
+
+import wikipedia, config, catlib
+from category import *
+
+redirect_templates = [u'Category redirect', u'Categoryredirect', u'See cat', u'Seecat', u'Catredirect', u'Cat redirect', u'CatRed', u'Catredir']
+move_message = u'Moving from [[%s|%s]] to [[%s|%s]] (following [[Template:Category redirect|category redirect]])'
+
+def get_redirect_cat(category=None):
+ '''
+ Return the target category
+ '''
+ destination = None
+ site = wikipedia.getSite(u'commons', u'commons')
+ for template in category.templatesWithParams():
+ if ((template[0] in redirect_templates) and (len(template[1]) > 0)):
+ #destination = template[1][0];
+ destination =catlib.Category(site, template[1][0])
+ if not destination.exists():
+ return None
+ return destination
+
+
+def main():
+ '''
+ Main loop. Loop over all categories of Category:Non-empty_category_redirects and move all content.
+ '''
+
+ site = wikipedia.getSite(u'commons', u'commons')
+ dirtycat = catlib.Category(site, u'Category:Non-empty category redirects')
+ destination = None
+ catbot = None
+
+ for old_category in dirtycat.subcategories():
+ destination = get_redirect_cat(old_category)
+ if destination:
+ wikipedia.output(destination.title())
+ for page in old_category.articles():
+ try:
+ catlib.change_category(page, old_category, destination, move_message % (old_category.title(), old_category.titleWithoutNamespace(), destination.title(), destination.titleWithoutNamespace()))
+ except wikipedia.IsRedirectPage:
+ wikipedia.output(page.title() + u' is a redirect!')
+ for cat in old_category.subcategories():
+ try:
+ catlib.change_category(cat, old_category, destination, move_message % (old_category.title(), old_category.titleWithoutNamespace(), destination.title(), destination.titleWithoutNamespace()))
+ except wikipedia.IsRedirectPage:
+ wikipedia.output(page.title() + u' is a redirect!')
+        #Dummy edit to refresh the page, shouldn't show up in any logs.
+ try:
+ old_category.put(old_category.get())
+ except:
+ wikipedia.output(u'Dummy edit at ' + old_category.title() + u' failed')
+
+if __name__ == "__main__":
+ try:
+ main()
+ finally:
+ wikipedia.stopme()
Property changes on: trunk/pywikipedia/category_redirect.py
___________________________________________________________________
Name: svn:eol-style
+ native
Modified: trunk/pywikipedia/commonsdelinker/plugins.txt
===================================================================
--- trunk/pywikipedia/commonsdelinker/plugins.txt 2008-07-05 12:57:13 UTC (rev 5674)
+++ trunk/pywikipedia/commonsdelinker/plugins.txt 2008-07-05 18:21:03 UTC (rev 5675)
@@ -1,53 +1,53 @@
-CommonsDelinker supports a plugin system, which allows modifying the delink and
-replace parameters on a case by case basis.
-
-Plugins should be registered in the configuration file. CommonsDelinker expects
-the configuration value CommonsDelinker['plugins'] to be an iterable object.
-The items of this iterable should be module.object strings of the plugin. The
-plugin is expected to reside as module.py in commonsdelinker/plugins. The
-object should exist and should be a callable object or type with an attribute
-'hook' being a string indicating the hook name.
-
-Some parameters are modifiable by the plugin. Those include the mutable objects
-and some immutable objects wrapped in an ImmutableByReference object. The value
-of such an object can be get/set by the get and set method. Modifiable
-parameters are preceded by an ampersand & in this documentation.
-
-A hook that gives False as return value will terminate the hook chain and for
-most hooks also terminate the caller.
-
-== List of hooks and their parameters ==
-
-before_delink(image, usage, timestamp, admin, reason, replacement)
- Called once per image. Returning False will cancel delinking this image.
-
-simple_replace(page, summary, image, &replacement, match, groups)
-gallery_replace(page, summary, image, &replacement, match, groups)
-complex_replace(page, summary, image, &replacement, match, groups)
- Called each time an occurrence is to be replaced. Returning False will not
- replace this occurrence.
-
-before_save(page, text, &new_text, &summary)
- Called before the page is saved. Returning False will not save the page.
-
-after_delink(image, usage, timestamp, admin, reason, replacement)
- Called once per image after delink.
-
-== Example ==
-# Saves a diff for every delink.
-import difflib
-
-class Diff(object):
- hook = 'before_save'
- def __init__(self, CommonsDelinker):
- self.CommonsDelinker = CommonsDelinker
- def __call__(self, page, text, new_text, summary):
- diff = difflib.context_diff(
- text.encode('utf-8').splitlines(True),
- new_text.get().encode('utf-8').splitlines(True))
-
- f = open((u'diff/%s-%s-%s.txt' % (page.urlname().replace('/', '-'),
- page.site().dbName(), page.editTime())).encode('utf-8', 'ignore'), 'w')
-
- f.writelines(diff)
- f.close()
\ No newline at end of file
+CommonsDelinker supports a plugin system, which allows modifying the delink and
+replace parameters on a case by case basis.
+
+Plugins should be registered in the configuration file. CommonsDelinker expects
+the configuration value CommonsDelinker['plugins'] to be an iterable object.
+The items of this iterable should be module.object strings of the plugin. The
+plugin is expected to reside as module.py in commonsdelinker/plugins. The
+object should exist and should be a callable object or type with an attribute
+'hook' being a string indicating the hook name.
+
+Some parameters are modifiable by the plugin. Those include the mutable objects
+and some immutable objects wrapped in an ImmutableByReference object. The value
+of such an object can be get/set by the get and set method. Modifiable
+parameters are preceded by an ampersand & in this documentation.
+
+A hook that gives False as return value will terminate the hook chain and for
+most hooks also terminate the caller.
+
+== List of hooks and their parameters ==
+
+before_delink(image, usage, timestamp, admin, reason, replacement)
+ Called once per image. Returning False will cancel delinking this image.
+
+simple_replace(page, summary, image, &replacement, match, groups)
+gallery_replace(page, summary, image, &replacement, match, groups)
+complex_replace(page, summary, image, &replacement, match, groups)
+ Called each time an occurrence is to be replaced. Returning False will not
+ replace this occurrence.
+
+before_save(page, text, &new_text, &summary)
+ Called before the page is saved. Returning False will not save the page.
+
+after_delink(image, usage, timestamp, admin, reason, replacement)
+ Called once per image after delink.
+
+== Example ==
+# Saves a diff for every delink.
+import difflib
+
+class Diff(object):
+ hook = 'before_save'
+ def __init__(self, CommonsDelinker):
+ self.CommonsDelinker = CommonsDelinker
+ def __call__(self, page, text, new_text, summary):
+ diff = difflib.context_diff(
+ text.encode('utf-8').splitlines(True),
+ new_text.get().encode('utf-8').splitlines(True))
+
+ f = open((u'diff/%s-%s-%s.txt' % (page.urlname().replace('/', '-'),
+ page.site().dbName(), page.editTime())).encode('utf-8', 'ignore'), 'w')
+
+ f.writelines(diff)
+ f.close()
Property changes on: trunk/pywikipedia/commonsdelinker/plugins.txt
___________________________________________________________________
Name: svn:eol-style
+ native
Modified: trunk/pywikipedia/families/README-family.txt
===================================================================
--- trunk/pywikipedia/families/README-family.txt 2008-07-05 12:57:13 UTC (rev 5674)
+++ trunk/pywikipedia/families/README-family.txt 2008-07-05 18:21:03 UTC (rev 5675)
@@ -1,183 +1,183 @@
-How to create a new family file to add a new wiki to the bot framework.
-
-(c) 2008, the Pywikipediabot team
-
-Copy and paste the text below "COPY HERE" into your favorite text editor, and
-save it as WIKINAME_family.py in the families/ subdirectory. Replace
-WIKINAME with the name you want to use for the new wiki family, making sure
-that it doesn't duplicate any existing name.
-
-A "family" is any group of wikis located on the same server; usually they
-are versions of the same type of content in different languages, but this
-isn't required. A family can consist of just one wiki, or more; if there is
-more than one wiki, each wiki needs to be identified by a unique code.
-
-After you copy the text, go through and edit it, based upon the comment
-lines. First, do a global search-and-replace to change all instances of
-'WIKINAME' to your actual wiki name. Everything in the example below is
-based on the bot's default settings, except for the namespace names, which
-are made-up examples. You only need to change it if your wiki's value is
-different from the default. You can delete anything that is not indicated as
-"REQUIRED", if your new wiki doesn't vary from the default settings.
-
-== COPY HERE ==
-
-# -*- coding: utf-8 -*- # REQUIRED
-import config, family, urllib # REQUIRED
-
-class Family(family.Family): # REQUIRED
- def __init__(self): # REQUIRED
- family.Family.__init__(self) # REQUIRED
- self.name = 'WIKINAME' # REQUIRED; replace with actual name
-
- self.langs = { # REQUIRED
- 'en': 'www.example.com', # Include one line for each wiki in family
- 'fr': 'www.example.fr', # in the format 'code': 'hostname',
- }
-
- # Translation used on all wikis for the different namespaces.
- # Most namespaces are inherited from family.Family.
- # Check the family.py file (in main directory) to see the standard
- # namespace translations for each known language.
-
- # You only need to enter translations that differ from the default.
- # There are two ways of entering namespace translations.
- # 1. If you only need to change the translation of a particular
- # namespace for one or two languages, use this format:
- self.namespaces[2]['en'] = u'Wikiuser'
- self.namespaces[3]['en'] = u'Wikiuser talk'
-
- # 2. If you need to change the translation for many languages
- # for the same namespace number, use this format (this is common
- # for namespaces 4 and 5, because these are usually given a
- # unique name for each wiki):
- self.namespaces[4] = {
- '_default': [u'WIKINAME', self.namespaces[4]['_default']], # REQUIRED
- 'de': 'Name des wiki',
- 'es': 'Nombre del wiki',
- 'fr': 'Nom du wiki',
- # ETC.
- }
-
- # Wikimedia wikis all use "bodyContent" as the id of the <div>
- # element that contains the actual page content; change this for
- # wikis that use something else (e.g., mozilla family)
- self.content_id = "bodyContent"
-
- # On most wikis page names must start with a capital letter, but some
- # languages don't use this. This should be a list of languages that
- # _don't_ require the first letter to be capitalized; e.g.,
- # self.nocapitalize = ['foo', 'bar']
- self.nocapitalize = []
-
- # SETTINGS FOR WIKIS THAT USE DISAMBIGUATION PAGES:
-
- # A list of disambiguation template names in different languages
- self.disambiguationTemplates = {
- 'en': ['disambig', 'disambiguation'],
- }
-
- # A list with the name of the category containing disambiguation
- # pages for the various languages. Only one category per language,
- # and without the namespace, so add things like:
- self.disambcatname = {
- 'en': "Disambiguation",
- }
-
- # SETTINGS FOR WIKIS THAT USE INTERLANGUAGE LINKS:
-
- # attop is a list of languages that prefer to have the interwiki
- # links at the top of the page.
- self.interwiki_attop = []
-
- # on_one_line is a list of languages that want the interwiki links
- # one-after-another on a single line
- self.interwiki_on_one_line = []
-
- # String used as separator between interwiki links and the text
- self.interwiki_text_separator = '\r\n\r\n'
-
- # Which languages have a special order for putting interlanguage links,
- # and what order is it? If a language is not in interwiki_putfirst,
- # alphabetical order on language code is used. For languages that are in
- # interwiki_putfirst, interwiki_putfirst is checked first, and
- # languages are put in the order given there. All other languages are put
- # after those, in code-alphabetical order.
- self.interwiki_putfirst = {}
-
- # Languages in interwiki_putfirst_doubled should have a number plus a list
- # of languages. If there are at least the number of interwiki links, all
- # languages in the list should be placed at the front as well as in the
- # normal list.
- self.interwiki_putfirst_doubled = {}
-
- # Some families, e. g. commons and meta, are not multilingual and
- # forward interlanguage links to another family (wikipedia).
- # These families can set this variable to the name of the target
- # family.
- self.interwiki_forward = None
-
- # Which language codes no longer exist and by which language code
- # should they be replaced. If for example the language with code xx:
- # has been replaced by code yy:, add {'xx':'yy'} to obsolete.
- # If all links to language xx: should be removed, add {'xx': None}.
- self.obsolete = {}
-
- # SETTINGS FOR CATEGORY LINKS:
-
- # Languages that want the category links at the top of the page
- self.category_attop = []
-
- # languages that want the category links
- # one-after-another on a single line
- self.category_on_one_line = []
-
- # String used as separator between category links and the text
- self.category_text_separator = '\r\n\r\n'
-
- # When both at the bottom should categories come after interwikilinks?
- self.categories_last = []
-
- # SETTINGS FOR LDAP AUTHENTICATION
- # If your wiki uses:
- # http://www.mediawiki.org/wiki/Extension:LDAP_Authentication.
- # then uncomment this line and define the user's domain required
- # at login.
- #self.name = 'domain here'
-
- def protocol(self, code):
- """
- Can be overridden to return 'https'. Other protocols are not supported.
- """
- return 'http'
-
- def scriptpath(self, code):
- """The prefix used to locate scripts on this wiki.
-
- This is the value displayed when you enter {{SCRIPTPATH}} on a
- wiki page (often displayed at [[Help:Variables]] if the wiki has
- copied the master help page correctly).
-
- The default value is the one used on Wikimedia Foundation wikis,
- but needs to be overridden in the family file for any wiki that
- uses a different value.
-
- """
- return '/w'
-
- # IMPORTANT: if your wiki does not support the api.php interface,
- # you must uncomment the second line of this method:
- def apipath(self, code):
- # raise NotImplementedError, "%s wiki family does not support api.php" % self.name
- return '%s/api.php' % self.scriptpath(code)
-
- # Which version of MediaWiki is used?
- def version(self, code):
- # Replace with the actual version being run on your wiki
- return '1.13alpha'
-
- def code2encoding(self, code):
- """Return the encoding for a specific language wiki"""
- # Most wikis nowadays use UTF-8, but change this if yours uses
- # a different encoding
- return 'utf-8'
+How to create a new family file to add a new wiki to the bot framework.
+
+(c) 2008, the Pywikipediabot team
+
+Copy and paste the text below "COPY HERE" into your favorite text editor, and
+save it as WIKINAME_family.py in the families/ subdirectory. Replace
+WIKINAME with the name you want to use for the new wiki family, making sure
+that it doesn't duplicate any existing name.
+
+A "family" is any group of wikis located on the same server; usually they
+are versions of the same type of content in different languages, but this
+isn't required. A family can consist of just one wiki, or more; if there is
+more than one wiki, each wiki needs to be identified by a unique code.
+
+After you copy the text, go through and edit it, based upon the comment
+lines. First, do a global search-and-replace to change all instances of
+'WIKINAME' to your actual wiki name. Everything in the example below is
+based on the bot's default settings, except for the namespace names, which
+are made-up examples. You only need to change it if your wiki's value is
+different from the default. You can delete anything that is not indicated as
+"REQUIRED", if your new wiki doesn't vary from the default settings.
+
+== COPY HERE ==
+
+# -*- coding: utf-8 -*- # REQUIRED
+import config, family, urllib # REQUIRED
+
+class Family(family.Family): # REQUIRED
+ def __init__(self): # REQUIRED
+ family.Family.__init__(self) # REQUIRED
+ self.name = 'WIKINAME' # REQUIRED; replace with actual name
+
+ self.langs = { # REQUIRED
+ 'en': 'www.example.com', # Include one line for each wiki in family
+ 'fr': 'www.example.fr', # in the format 'code': 'hostname',
+ }
+
+ # Translation used on all wikis for the different namespaces.
+ # Most namespaces are inherited from family.Family.
+ # Check the family.py file (in main directory) to see the standard
+ # namespace translations for each known language.
+
+ # You only need to enter translations that differ from the default.
+ # There are two ways of entering namespace translations.
+ # 1. If you only need to change the translation of a particular
+ # namespace for one or two languages, use this format:
+ self.namespaces[2]['en'] = u'Wikiuser'
+ self.namespaces[3]['en'] = u'Wikiuser talk'
+
+ # 2. If you need to change the translation for many languages
+ # for the same namespace number, use this format (this is common
+ # for namespaces 4 and 5, because these are usually given a
+ # unique name for each wiki):
+ self.namespaces[4] = {
+ '_default': [u'WIKINAME', self.namespaces[4]['_default']], # REQUIRED
+ 'de': 'Name des wiki',
+ 'es': 'Nombre del wiki',
+ 'fr': 'Nom du wiki',
+ # ETC.
+ }
+
+ # Wikimedia wikis all use "bodyContent" as the id of the <div>
+ # element that contains the actual page content; change this for
+ # wikis that use something else (e.g., mozilla family)
+ self.content_id = "bodyContent"
+
+ # On most wikis page names must start with a capital letter, but some
+ # languages don't use this. This should be a list of languages that
+ # _don't_ require the first letter to be capitalized; e.g.,
+ # self.nocapitalize = ['foo', 'bar']
+ self.nocapitalize = []
+
+ # SETTINGS FOR WIKIS THAT USE DISAMBIGUATION PAGES:
+
+ # A list of disambiguation template names in different languages
+ self.disambiguationTemplates = {
+ 'en': ['disambig', 'disambiguation'],
+ }
+
+ # A list with the name of the category containing disambiguation
+ # pages for the various languages. Only one category per language,
+ # and without the namespace, so add things like:
+ self.disambcatname = {
+ 'en': "Disambiguation",
+ }
+
+ # SETTINGS FOR WIKIS THAT USE INTERLANGUAGE LINKS:
+
+ # attop is a list of languages that prefer to have the interwiki
+ # links at the top of the page.
+ self.interwiki_attop = []
+
+ # on_one_line is a list of languages that want the interwiki links
+ # one-after-another on a single line
+ self.interwiki_on_one_line = []
+
+ # String used as separator between interwiki links and the text
+ self.interwiki_text_separator = '\r\n\r\n'
+
+ # Which languages have a special order for putting interlanguage links,
+ # and what order is it? If a language is not in interwiki_putfirst,
+ # alphabetical order on language code is used. For languages that are in
+ # interwiki_putfirst, interwiki_putfirst is checked first, and
+ # languages are put in the order given there. All other languages are put
+ # after those, in code-alphabetical order.
+ self.interwiki_putfirst = {}
+
+ # Languages in interwiki_putfirst_doubled should have a number plus a list
+ # of languages. If there are at least the number of interwiki links, all
+ # languages in the list should be placed at the front as well as in the
+ # normal list.
+ self.interwiki_putfirst_doubled = {}
+
+ # Some families, e. g. commons and meta, are not multilingual and
+ # forward interlanguage links to another family (wikipedia).
+ # These families can set this variable to the name of the target
+ # family.
+ self.interwiki_forward = None
+
+ # Which language codes no longer exist and by which language code
+ # should they be replaced. If for example the language with code xx:
+ # has been replaced by code yy:, add {'xx':'yy'} to obsolete.
+ # If all links to language xx: should be removed, add {'xx': None}.
+ self.obsolete = {}
+
+ # SETTINGS FOR CATEGORY LINKS:
+
+ # Languages that want the category links at the top of the page
+ self.category_attop = []
+
+ # languages that want the category links
+ # one-after-another on a single line
+ self.category_on_one_line = []
+
+ # String used as separator between category links and the text
+ self.category_text_separator = '\r\n\r\n'
+
+ # When both at the bottom should categories come after interwikilinks?
+ self.categories_last = []
+
+ # SETTINGS FOR LDAP AUTHENTICATION
+ # If your wiki uses:
+ # http://www.mediawiki.org/wiki/Extension:LDAP_Authentication.
+ # then uncomment this line and define the user's domain required
+ # at login.
+ #self.name = 'domain here'
+
+ def protocol(self, code):
+ """
+ Can be overridden to return 'https'. Other protocols are not supported.
+ """
+ return 'http'
+
+ def scriptpath(self, code):
+ """The prefix used to locate scripts on this wiki.
+
+ This is the value displayed when you enter {{SCRIPTPATH}} on a
+ wiki page (often displayed at [[Help:Variables]] if the wiki has
+ copied the master help page correctly).
+
+ The default value is the one used on Wikimedia Foundation wikis,
+ but needs to be overridden in the family file for any wiki that
+ uses a different value.
+
+ """
+ return '/w'
+
+ # IMPORTANT: if your wiki does not support the api.php interface,
+ # you must uncomment the second line of this method:
+ def apipath(self, code):
+ # raise NotImplementedError, "%s wiki family does not support api.php" % self.name
+ return '%s/api.php' % self.scriptpath(code)
+
+ # Which version of MediaWiki is used?
+ def version(self, code):
+ # Replace with the actual version being run on your wiki
+ return '1.13alpha'
+
+ def code2encoding(self, code):
+ """Return the encoding for a specific language wiki"""
+ # Most wikis nowadays use UTF-8, but change this if yours uses
+ # a different encoding
+ return 'utf-8'
Property changes on: trunk/pywikipedia/families/README-family.txt
___________________________________________________________________
Name: svn:eol-style
+ native
Modified: trunk/pywikipedia/protect.py
===================================================================
--- trunk/pywikipedia/protect.py 2008-07-05 12:57:13 UTC (rev 5674)
+++ trunk/pywikipedia/protect.py 2008-07-05 18:21:03 UTC (rev 5675)
@@ -1,252 +1,252 @@
-# -*- coding: utf-8 -*-
-"""
-This script can be used to protect and unprotect pages en masse.
-Of course, you will need an admin account on the relevant wiki.
-
-Syntax: python protect.py OPTION...
-
-Command line options:
-
--page: Protect specified page
--cat: Protect all pages in the given category.
--nosubcats: Don't protect pages in the subcategories.
--links: Protect all pages linked from a given page.
--file: Protect all pages listed in a text file.
--ref: Protect all pages referring from a given page.
--images: Protect all images used on a given page.
--always: Don't prompt to protect pages, just do it.
--summary: Supply a custom edit summary.
--unprotect: Actually unprotect pages instead of protecting
--edit:PROTECTION_LEVEL Set edit protection level to PROTECTION_LEVEL
--move:PROTECTION_LEVEL Set move protection level to PROTECTION_LEVEL
-
-## Without support ##
-## -create:PROTECTION_LEVEL Set move protection level to PROTECTION_LEVEL ##
-
-Values for PROTECTION_LEVEL are: sysop, autoconfirmed, none.
-If an operation parameter (edit, move or create) is not specified, default
-protection level is 'sysop' (or 'none' if -unprotect).
-
-Examples:
-
-Protect everything in the category "To protect" prompting.
- python protect.py -cat:"To protect" -always
-
-Unprotect all pages listed in text file "unprotect.txt" without prompting.
- python protect.py -file:unprotect.txt -unprotect
-"""
-
-# Written by http://it.wikisource.org/wiki/Utente:Qualc1
-# Created by modifying delete.py
-__version__ = '$Id: delete.py 4946 2008-01-29 14:58:25Z wikipedian $'
-
-#
-# Distributed under the terms of the MIT license.
-#
-
-import wikipedia, catlib
-import pagegenerators
-
-# Summary messages for protecting from a category.
-msg_simple_protect = {
- 'en': u'Bot: Protecting a list of files.',
- 'ar': u'بوت: حماية قائمة من الملفات.',
- 'it': u'Bot: Protezione di una lista di pagine.',
- 'pt': u'Bot: Protegendo uma lista de artigos.',
-}
-msg_protect_category = {
- 'en': u'Robot - Protecting all pages from category %s',
- 'ar': u'روبوت - حماية كل الصفحات من التصنيف %s',
- 'it': u'Bot: Protezione di tutte le pagine nella categoria %s.',
- 'pt': u'Bot: Protegendo todos os artigos da categoria %s',
-}
-msg_protect_links = {
- 'en': u'Robot - Protecting all pages linked from %s',
- 'ar': u'روبوت - حماية كل الصفحات الموصولة من %s',
- 'it': u'Bot: Protezione di tutte le pagine linkate da %s.',
- 'pt': u'Bot: Protegendo todos os artigos ligados a %s',
-}
-msg_protect_ref = {
- 'en': u'Robot - Protecting all pages referring from %s',
- 'ar': u'روبوت - حماية كل الصفحات الراجعة من %s',
- 'it': u'Bot: Protezione di tutte le pagine con link verso %s.',
- 'pt': u'Bot: Protegendo todos os artigos afluentes a %s',
-}
-msg_protect_images = {
- 'en': u'Robot - Protecting all images on page %s',
- 'ar': u'روبوت - حماية كل الصور في الصفحة %s',
- 'it': u'Bot: Protezione di tutte le immagini presenti in %s.',
- 'pt': u'Bot: Protegendo todas as imagens do artigo %s',
-}
-
-class ProtectionRobot:
- """
- This robot allows protection of pages en masse.
- """
-
- def __init__(self, generator, summary, always = False, unprotect=False,
- edit='sysop', move='sysop', create='sysop'):
- """
- Arguments:
- * generator - A page generator.
- * always - Protect without prompting?
- * edit, move, create - protection level for these operations
- * unprotect - unprotect pages (and ignore edit, move, create params)
- """
- self.generator = generator
- self.summary = summary
- self.always = always
- self.unprotect = unprotect
- self.edit = edit
- self.move = move
-
- def run(self):
- """
- Starts the robot's action.
- """
- #Loop through everything in the page generator and (un)protect it.
- for page in self.generator:
- wikipedia.output(u'Processing page %s' % page.title())
- print self.edit, self.move#, self.create
- page.protect(unprotect=self.unprotect, reason=self.summary, prompt=self.always,
- edit=self.edit, move=self.move)
-
-# Asks a valid protection level for "operation".
-# Returns the protection level chosen by user.
-def choiceProtectionLevel(operation, default):
- default = default[0]
- firstChar = map(lambda level: level[0], protectionLevels)
- choiceChar = wikipedia.inputChoice('Choice a protection level to %s:' % operation,
- protectionLevels, firstChar, default = default)
- for level in protectionLevels:
- if level.startswith(choiceChar):
- return level
-
-def main():
- global protectionLevels
- protectionLevels = ['sysop', 'autoconfirmed', 'none']
-
- pageName = ''
- summary = ''
- always = False
- doSinglePage = False
- doCategory = False
- protectSubcategories = True
- doRef = False
- doLinks = False
- doImages = False
- fileName = ''
- gen = None
- edit = ''
- move = ''
- defaultProtection = 'sysop'
-
- # read command line parameters
- for arg in wikipedia.handleArgs():
- if arg == '-always':
- always = True
- elif arg.startswith('-file'):
- if len(arg) == len('-file'):
- fileName = wikipedia.input(u'Enter name of file to protect pages from:')
- else:
- fileName = arg[len('-file:'):]
- elif arg.startswith('-summary'):
- if len(arg) == len('-summary'):
- summary = wikipedia.input(u'Enter a reason for the protection:')
- else:
- summary = arg[len('-summary:'):]
- elif arg.startswith('-cat'):
- doCategory = True
- if len(arg) == len('-cat'):
- pageName = wikipedia.input(u'Enter the category to protect from:')
- else:
- pageName = arg[len('-cat:'):]
- elif arg.startswith('-nosubcats'):
- protectSubcategories = False
- elif arg.startswith('-links'):
- doLinks = True
- if len(arg) == len('-links'):
- pageName = wikipedia.input(u'Enter the page to protect from:')
- else:
- pageName = arg[len('-links:'):]
- elif arg.startswith('-ref'):
- doRef = True
- if len(arg) == len('-ref'):
- pageName = wikipedia.input(u'Enter the page to protect from:')
- else:
- pageName = arg[len('-ref:'):]
- elif arg.startswith('-page'):
- doSinglePage = True
- if len(arg) == len('-page'):
- pageName = wikipedia.input(u'Enter the page to protect:')
- else:
- pageName = arg[len('-page:'):]
- elif arg.startswith('-images'):
- doImages = True
- if len(arg) == len('-images'):
- pageName = wikipedia.input(u'Enter the page with the images to protect:')
- else:
- pageName = arg[len('-images:'):]
- elif arg.startswith('-unprotect'):
- defaultProtection = 'none'
- elif arg.startswith('-edit'):
- edit = arg[len('-edit:'):]
- if edit not in protectionLevels:
- edit = choiceProtectionLevel('edit', defaultProtection)
- elif arg.startswith('-move'):
- move = arg[len('-move:'):]
- if move not in protectionLevels:
- move = choiceProtectionLevel('move', defaultProtection)
- elif arg.startswith('-create'):
- create = arg[len('-create:'):]
- if create not in protectionLevels:
- create = choiceProtectionLevel('create', defaultProtection)
-
- mysite = wikipedia.getSite()
-
- if doSinglePage:
- if not summary:
- summary = wikipedia.input(u'Enter a reason for the protection:')
- page = wikipedia.Page(mysite, pageName)
- gen = iter([page])
- elif doCategory:
- if not summary:
- summary = wikipedia.translate(mysite, msg_protect_category) % pageName
- ns = mysite.category_namespace()
- categoryPage = catlib.Category(mysite, ns + ':' + pageName)
- gen = pagegenerators.CategorizedPageGenerator(categoryPage, recurse = protectSubcategories)
- elif doLinks:
- if not summary:
- summary = wikipedia.translate(mysite, msg_protect_links) % pageName
- linksPage = wikipedia.Page(mysite, pageName)
- gen = pagegenerators.LinkedPageGenerator(linksPage)
- elif doRef:
- if not summary:
- summary = wikipedia.translate(mysite, msg_protect_ref) % pageName
- refPage = wikipedia.Page(mysite, pageName)
- gen = pagegenerators.ReferringPageGenerator(refPage)
- elif fileName:
- if not summary:
- summary = wikipedia.translate(mysite, msg_simple_protect)
- gen = pagegenerators.TextfilePageGenerator(fileName)
- elif doImages:
- if not summary:
- summary = wikipedia.translate(mysite, msg_protect_images) % pageName
- gen = pagegenerators.ImagesPageGenerator(wikipedia.Page(mysite, pageName))
-
- if gen:
- wikipedia.setAction(summary)
- # We are just protecting pages, so we have no need of using a preloading page generator
- # to actually get the text of those pages.
- if not edit: edit = defaultProtection
- if not move: move = defaultProtection
- bot = ProtectionRobot(gen, summary, always, edit=edit, move=move)
- bot.run()
- else:
- wikipedia.showHelp(u'protect')
-
-if __name__ == "__main__":
- try:
- main()
- finally:
- wikipedia.stopme()
+# -*- coding: utf-8 -*-
+"""
+This script can be used to protect and unprotect pages en masse.
+Of course, you will need an admin account on the relevant wiki.
+
+Syntax: python protect.py OPTION...
+
+Command line options:
+
+-page: Protect specified page
+-cat: Protect all pages in the given category.
+-nosubcats: Don't protect pages in the subcategories.
+-links: Protect all pages linked from a given page.
+-file: Protect all pages listed in a text file.
+-ref: Protect all pages referring from a given page.
+-images: Protect all images used on a given page.
+-always: Don't prompt to protect pages, just do it.
+-summary: Supply a custom edit summary.
+-unprotect: Actually unprotect pages instead of protecting
+-edit:PROTECTION_LEVEL Set edit protection level to PROTECTION_LEVEL
+-move:PROTECTION_LEVEL Set move protection level to PROTECTION_LEVEL
+
+## Without support ##
+## -create:PROTECTION_LEVEL Set create protection level to PROTECTION_LEVEL ##
+
+Values for PROTECTION_LEVEL are: sysop, autoconfirmed, none.
+If an operation parameter (edit, move or create) is not specified, default
+protection level is 'sysop' (or 'none' if -unprotect).
+
+Examples:
+
+Protect everything in the category "To protect" without prompting.
+ python protect.py -cat:"To protect" -always
+
+Unprotect all pages listed in text file "unprotect.txt", prompting for each page.
+ python protect.py -file:unprotect.txt -unprotect
+"""
+
+# Written by http://it.wikisource.org/wiki/Utente:Qualc1
+# Created by modifying delete.py
+__version__ = '$Id: delete.py 4946 2008-01-29 14:58:25Z wikipedian $'
+
+#
+# Distributed under the terms of the MIT license.
+#
+
+import wikipedia, catlib
+import pagegenerators
+
# Edit summaries keyed by language code, resolved at runtime via
# wikipedia.translate(mysite, ...).

# Summary for pages taken from a text file (-file).
msg_simple_protect = {
    'en': u'Bot: Protecting a list of files.',
    'ar': u'بوت: حماية قائمة من الملفات.',
    'it': u'Bot: Protezione di una lista di pagine.',
    'pt': u'Bot: Protegendo uma lista de artigos.',
}

# Summary for all pages in a category (-cat); %s is the category name.
msg_protect_category = {
    'en': u'Robot - Protecting all pages from category %s',
    'ar': u'روبوت - حماية كل الصفحات من التصنيف %s',
    'it': u'Bot: Protezione di tutte le pagine nella categoria %s.',
    'pt': u'Bot: Protegendo todos os artigos da categoria %s',
}

# Summary for pages linked from a given page (-links); %s is that page.
msg_protect_links = {
    'en': u'Robot - Protecting all pages linked from %s',
    'ar': u'روبوت - حماية كل الصفحات الموصولة من %s',
    'it': u'Bot: Protezione di tutte le pagine linkate da %s.',
    'pt': u'Bot: Protegendo todos os artigos ligados a %s',
}

# Summary for pages referring to a given page (-ref); %s is that page.
msg_protect_ref = {
    'en': u'Robot - Protecting all pages referring from %s',
    'ar': u'روبوت - حماية كل الصفحات الراجعة من %s',
    'it': u'Bot: Protezione di tutte le pagine con link verso %s.',
    'pt': u'Bot: Protegendo todos os artigos afluentes a %s',
}

# Summary for all images used on a given page (-images); %s is that page.
msg_protect_images = {
    'en': u'Robot - Protecting all images on page %s',
    'ar': u'روبوت - حماية كل الصور في الصفحة %s',
    'it': u'Bot: Protezione di tutte le immagini presenti in %s.',
    'pt': u'Bot: Protegendo todas as imagens do artigo %s',
}
+
class ProtectionRobot:
    """Bot that (un)protects the pages yielded by a generator en masse."""

    def __init__(self, generator, summary, always=False, unprotect=False,
                 edit='sysop', move='sysop', create='sysop'):
        """
        Arguments:
        * generator - page generator yielding the pages to process.
        * summary   - reason recorded in the protection log.
        * always    - if True, apply the action without prompting.
        * unprotect - if True, unprotect the pages instead (edit/move
                      levels are then ignored by page.protect()).
        * edit, move, create - protection level for each operation
                      ('sysop', 'autoconfirmed' or 'none').
        """
        self.generator = generator
        self.summary = summary
        self.always = always
        self.unprotect = unprotect
        self.edit = edit
        self.move = move
        # Stored for interface completeness; create-protection is not yet
        # supported (see module docstring) so run() never applies it.
        self.create = create

    def run(self):
        """Loop through the generator and (un)protect every page."""
        for page in self.generator:
            wikipedia.output(u'Processing page %s' % page.title())
            # NOTE(review): passing 'prompt=self.always' looks inverted --
            # '-always' is documented as "don't prompt"; verify against
            # wikipedia.Page.protect() before changing it.
            page.protect(unprotect=self.unprotect, reason=self.summary,
                         prompt=self.always,
                         edit=self.edit, move=self.move)
+
def choiceProtectionLevel(operation, default):
    """Prompt the operator for a valid protection level for *operation*.

    Offers the module-global protectionLevels list, using the first
    character of *default* as the default answer, and returns the full
    level name matching the character the user chose.
    """
    default = default[0]
    # One-character shortcut for each level ('s', 'a', 'n').
    firstChar = [level[0] for level in protectionLevels]
    choiceChar = wikipedia.inputChoice(
        'Choose a protection level to %s:' % operation,
        protectionLevels, firstChar, default=default)
    for level in protectionLevels:
        if level.startswith(choiceChar):
            return level
+
def main():
    """Parse command-line arguments, build a page generator and run the bot."""
    # choiceProtectionLevel() reads this list, so publish it as a global.
    global protectionLevels
    protectionLevels = ['sysop', 'autoconfirmed', 'none']

    # Defaults for all command-line options.
    pageName = ''
    summary = ''
    always = False
    doSinglePage = False
    doCategory = False
    protectSubcategories = True
    doRef = False
    doLinks = False
    doImages = False
    fileName = ''
    gen = None
    edit = ''
    move = ''
    # 'sysop' unless -unprotect is given, in which case 'none'.
    defaultProtection = 'sysop'

    # read command line parameters
    for arg in wikipedia.handleArgs():
        if arg == '-always':
            always = True
        elif arg.startswith('-file'):
            # Each option below prompts interactively when given no value.
            if len(arg) == len('-file'):
                fileName = wikipedia.input(u'Enter name of file to protect pages from:')
            else:
                fileName = arg[len('-file:'):]
        elif arg.startswith('-summary'):
            if len(arg) == len('-summary'):
                summary = wikipedia.input(u'Enter a reason for the protection:')
            else:
                summary = arg[len('-summary:'):]
        elif arg.startswith('-cat'):
            doCategory = True
            if len(arg) == len('-cat'):
                pageName = wikipedia.input(u'Enter the category to protect from:')
            else:
                pageName = arg[len('-cat:'):]
        elif arg.startswith('-nosubcats'):
            protectSubcategories = False
        elif arg.startswith('-links'):
            doLinks = True
            if len(arg) == len('-links'):
                pageName = wikipedia.input(u'Enter the page to protect from:')
            else:
                pageName = arg[len('-links:'):]
        elif arg.startswith('-ref'):
            doRef = True
            if len(arg) == len('-ref'):
                pageName = wikipedia.input(u'Enter the page to protect from:')
            else:
                pageName = arg[len('-ref:'):]
        elif arg.startswith('-page'):
            doSinglePage = True
            if len(arg) == len('-page'):
                pageName = wikipedia.input(u'Enter the page to protect:')
            else:
                pageName = arg[len('-page:'):]
        elif arg.startswith('-images'):
            doImages = True
            if len(arg) == len('-images'):
                pageName = wikipedia.input(u'Enter the page with the images to protect:')
            else:
                pageName = arg[len('-images:'):]
        elif arg.startswith('-unprotect'):
            # NOTE(review): this only changes the default level to 'none';
            # ProtectionRobot's 'unprotect' flag is never set.  Confirm that
            # protecting with level 'none' really unprotects the page.
            defaultProtection = 'none'
        elif arg.startswith('-edit'):
            edit = arg[len('-edit:'):]
            if edit not in protectionLevels:
                edit = choiceProtectionLevel('edit', defaultProtection)
        elif arg.startswith('-move'):
            move = arg[len('-move:'):]
            if move not in protectionLevels:
                move = choiceProtectionLevel('move', defaultProtection)
        elif arg.startswith('-create'):
            # NOTE(review): 'create' is collected but never passed to the
            # bot below -- create-protection is unsupported (see docstring).
            create = arg[len('-create:'):]
            if create not in protectionLevels:
                create = choiceProtectionLevel('create', defaultProtection)

    mysite = wikipedia.getSite()

    # Build the page generator for the selected mode, with a translated
    # default summary when none was given on the command line.
    if doSinglePage:
        if not summary:
            summary = wikipedia.input(u'Enter a reason for the protection:')
        page = wikipedia.Page(mysite, pageName)
        gen = iter([page])
    elif doCategory:
        if not summary:
            summary = wikipedia.translate(mysite, msg_protect_category) % pageName
        ns = mysite.category_namespace()
        categoryPage = catlib.Category(mysite, ns + ':' + pageName)
        gen = pagegenerators.CategorizedPageGenerator(categoryPage, recurse = protectSubcategories)
    elif doLinks:
        if not summary:
            summary = wikipedia.translate(mysite, msg_protect_links) % pageName
        linksPage = wikipedia.Page(mysite, pageName)
        gen = pagegenerators.LinkedPageGenerator(linksPage)
    elif doRef:
        if not summary:
            summary = wikipedia.translate(mysite, msg_protect_ref) % pageName
        refPage = wikipedia.Page(mysite, pageName)
        gen = pagegenerators.ReferringPageGenerator(refPage)
    elif fileName:
        if not summary:
            summary = wikipedia.translate(mysite, msg_simple_protect)
        gen = pagegenerators.TextfilePageGenerator(fileName)
    elif doImages:
        if not summary:
            summary = wikipedia.translate(mysite, msg_protect_images) % pageName
        gen = pagegenerators.ImagesPageGenerator(wikipedia.Page(mysite, pageName))

    if gen:
        wikipedia.setAction(summary)
        # We are just protecting pages, so we have no need of using a preloading page generator
        # to actually get the text of those pages.
        if not edit: edit = defaultProtection
        if not move: move = defaultProtection
        bot = ProtectionRobot(gen, summary, always, edit=edit, move=move)
        bot.run()
    else:
        # No mode was selected: show this script's help text.
        wikipedia.showHelp(u'protect')
+
if __name__ == "__main__":
    try:
        main()
    finally:
        # Always release the framework's throttle/log resources, even on error.
        wikipedia.stopme()
Property changes on: trunk/pywikipedia/protect.py
___________________________________________________________________
Name: svn:eol-style
+ native