pywikibot

pywikibot@lists.wikimedia.org


[Pywikipedia-l] [ pywikipediabot-Patches-1792406 ] noreferences.py LT translation
by SourceForge.net 12 Sep '07

Patches item #1792406, was opened at 2007-09-11 17:01
Message generated for change (Comment added) made by wikipedian
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=603140&aid=1792406&group_…

Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.

Category: None
Group: None
>Status: Closed
Resolution: Accepted
Priority: 5
Private: No
Submitted By: Aurimas Fischer (ebola_rulez)
Assigned to: Daniel Herding (wikipedian)
Summary: noreferences.py LT translation

Initial Comment:
Translation for lt.wiki

----------------------------------------------------------------------

>Comment By: Daniel Herding (wikipedian)
Date: 2007-09-12 14:47

Message:
Logged In: YES
user_id=880694
Originator: NO

OK then :)

----------------------------------------------------------------------

Comment By: Aurimas Fischer (ebola_rulez)
Date: 2007-09-11 17:38

Message:
Logged In: YES
user_id=959303
Originator: YES

I'm sure it is correct. We even have a template ({{Litref}}) that adds
<references /> into the 'Literatūra' section.

----------------------------------------------------------------------

Comment By: Daniel Herding (wikipedian)
Date: 2007-09-11 17:34

Message:
Logged In: YES
user_id=880694
Originator: NO

I applied it, but are you sure it is correct? Do you really want to add
<references/> tags into the 'Literature' section? I think 'Literatūra'
should be in the placeBeforeSections dictionary, where 'Nuorodos' (external
links) is. Is 'Šaltiniai' Lithuanian for 'References'?

----------------------------------------------------------------------

You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=603140&aid=1792406&group_…

[Pywikipedia-l] [ pywikipediabot-Patches-1791668 ] Edit to templatecount.py for being reusable by other scripts
by SourceForge.net 12 Sep '07

Patches item #1791668, was opened at 2007-09-10 17:27
Message generated for change (Settings changed) made by wikipedian
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=603140&aid=1791668&group_…

Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.

Category: None
Group: None
>Status: Closed
>Resolution: Accepted
Priority: 5
Private: No
Submitted By: Pietrodn (pietrodn)
Assigned to: Nobody/Anonymous (nobody)
Summary: Edit to templatecount.py for being reusable by other scripts

Initial Comment:
I have made an edit to templatecount.py: now you can reuse the
TemplateCountRobot in other scripts, because it returns values.
Before, the class's functions only printed the results to the standard output.
Now, they also return the values as dictionaries.

----------------------------------------------------------------------

You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=603140&aid=1791668&group_…

[Pywikipedia-l] SVN: [4255] trunk/pywikipedia/templatecount.py
by wikipedian＠svn.wikimedia.org 12 Sep '07

Revision: 4255
Author:   wikipedian
Date:     2007-09-12 12:47:01 +0000 (Wed, 12 Sep 2007)

Log Message:
-----------
applied patch by Pietrodn
[ 1791668 ] Edit to templatecount.py for being reusable by other scripts

Modified Paths:
--------------
    trunk/pywikipedia/templatecount.py

Modified: trunk/pywikipedia/templatecount.py
===================================================================
--- trunk/pywikipedia/templatecount.py  2007-09-12 12:29:22 UTC (rev 4254)
+++ trunk/pywikipedia/templatecount.py  2007-09-12 12:47:01 UTC (rev 4255)
@@ -39,43 +39,51 @@
         mysite = wikipedia.getSite()
         finalText = [u'Number of transclusions per template',u'------------------------------------']
         total = 0
+        # The names of the templates are the keys, and the numbers of transclusions are the values.
+        templateDict = {}
         for template in templates:
             gen = pagegenerators.ReferringPageGenerator(wikipedia.Page(mysite, mysite.template_namespace() + ':' + template), onlyTemplateInclusion = True)
             if namespaces:
                 gen = pagegenerators.NamespaceFilterPageGenerator(gen, namespaces)
             count = 0
             for page in gen:
-                count = count + 1
+                count += 1
+            templateDict[template] = count
             finalText.append(u'%s: %d' % (template, count))
             total = total + count
         for line in finalText:
             wikipedia.output(line, toStdout=True)
         wikipedia.output(u'TOTAL: %d' % total, toStdout=True)
         wikipedia.output(u'Report generated on %s' % datetime.datetime.utcnow().isoformat(), toStdout=True)
+        return templateDict
 
     def listTemplates(self, templates, namespaces):
         mysite = wikipedia.getSite()
         count = 0
+        # The names of the templates are the keys, and lists of pages transcluding templates are the values.
+        templateDict = {}
         finalText = [u'List of pages transcluding templates:']
         for template in templates:
             finalText.append(u'* %s' % template)
         finalText.append(u'------------------------------------')
         for template in templates:
+            transcludingArray = []
             gen = pagegenerators.ReferringPageGenerator(wikipedia.Page(mysite, mysite.template_namespace() + ':' + template), onlyTemplateInclusion = True)
             if namespaces:
                 gen = pagegenerators.NamespaceFilterPageGenerator(gen, namespaces)
             for page in gen:
                 finalText.append(u'%s' % page.title())
-                count = count + 1
+                count += 1
+                transcludingArray.append(page)
+            templateDict[template] = transcludingArray;
         finalText.append(u'Total page count: %d' % count)
         for line in finalText:
             wikipedia.output(line, toStdout=True)
         wikipedia.output(u'Report generated on %s' % datetime.datetime.utcnow().isoformat(), toStdout=True)
+        return templateDict
 
 def main():
-    operation = "None"
-    doCount = False
-    doList = False
+    operation = None
     argsList = []
     namespaces = []
 
@@ -92,7 +100,7 @@
         else:
             argsList.append(arg)
 
-    if operation == "None":
+    if operation == None:
         wikipedia.output(__doc__, 'utf-8')
     else:
         robot = TemplateCountRobot()
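
The point of this revision is that the two methods now return a dictionary keyed by template name, so another script can work with the numbers instead of parsing the printed report. A minimal Python 3 sketch of that idea follows; `count_transclusions` and the `referring_pages` mapping are hypothetical stand-ins for the robot method and for `ReferringPageGenerator`, not part of pywikipedia.

```python
def count_transclusions(templates, referring_pages):
    """Return {template name: number of transcluding pages}.

    `referring_pages` is a hypothetical mapping from template name to an
    iterable of page titles; it stands in for the page generator that the
    real script builds per template.
    """
    template_dict = {}
    for template in templates:
        count = 0
        for _page in referring_pages.get(template, ()):
            count += 1
        template_dict[template] = count
    return template_dict


if __name__ == "__main__":
    pages = {"ref": ["A", "B", "C"], "note": ["A"]}
    counts = count_transclusions(["ref", "note", "ref label"], pages)
    # The caller can post-process the result instead of reading stdout.
    for name, number in sorted(counts.items()):
        print("%s: %d" % (name, number))
```

Templates without any transcluding page still get an entry (with the value 0), which mirrors the per-template assignment in the patched loop.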

[Pywikipedia-l] SVN: [4254] trunk/pywikipedia/wikipedia.py
by wikipedian＠svn.wikimedia.org 12 Sep '07

Revision: 4254
Author:   wikipedian
Date:     2007-09-12 12:29:22 +0000 (Wed, 12 Sep 2007)

Log Message:
-----------
heavily simplified the code, using named groups etc.
renamed identifiers to make it more understandable

Modified Paths:
--------------
    trunk/pywikipedia/wikipedia.py

Modified: trunk/pywikipedia/wikipedia.py
===================================================================
--- trunk/pywikipedia/wikipedia.py  2007-09-12 10:54:24 UTC (rev 4253)
+++ trunk/pywikipedia/wikipedia.py  2007-09-12 12:29:22 UTC (rev 4254)
@@ -2042,67 +2042,71 @@
 
         # Copyright (c) Orgullomoore, Bryan
 
-        # TODO: document and simplify the code, use understandable variable names
+        # TODO: document and simplify the code
         site = self.site()
 
         text = self.get()
         new_text = text
 
-        def create_regex(s):
-            s = re.escape(s)
-            return ur'(?:[%s%s]%s)' % (s[0].upper(), s[0].lower(), s[1:])
-        def create_regex_i(s):
+        def caseInsensitivePattern(s):
             """
-            Creates a pattern that matches the string (unescaped), case-insensitively.
-            Somehow an awkward way of doing this.
+            Creates a pattern that matches the string case-insensitively.
             """
+            s = re.escape(s)
             return ur'(?:%s)' % u''.join([u'[%s%s]' % (c.upper(), c.lower()) for c in s])
 
-        namespaces = ('Image', 'Media') + site.namespace(6, all = True) + site.namespace(-2, all = True)
+        def capitalizationPattern(s):
+            """
+            Given a string, creates a pattern that matches the string, with
+            the first letter case-insensitive if capitalization is switched
+            on on the site you're working on.
+            """
+            s = re.escape(s)
+            if self.site().nocapitalize:
+                return s
+            else:
+                return ur'(?:[%s%s]%s)' % (s[0].upper(), s[0].lower(), s[1:])
+
+        namespaces = set(('Image', 'Media') + site.namespace(6, all = True) + site.namespace(-2, all = True))
         # note that the colon is already included here
-        r_namespace = ur'\s*(?:%s)\s*\:\s*' % u'|'.join(map(create_regex_i, namespaces))
-        r_image = u'(%s)' % create_regex(image).replace(r'\_', '[ _]')
+        namespacePattern = ur'\s*(?:%s)\s*\:\s*' % u'|'.join(map(caseInsensitivePattern, namespaces))
 
-        def simple_replacer(match, groupNumber = 1):
+        imagePattern = u'(%s)' % capitalizationPattern(image).replace(r'\_', '[ _]')
+
+        def filename_replacer(match):
             if replacement == None:
                 return u''
             else:
-                groups = list(match.groups())
-                groups[groupNumber] = replacement
-                return u''.join(groups)
+                old = match.group()
+                return old[:match.start('filename')] + replacement + old[match.end('filename'):]
 
         # The group params contains parameters such as thumb and 200px, as well
         # as the image caption. The caption can contain wiki links, but each
         # link has to be closed properly.
-        r_param = r'(?:\|(?:(?!\[\[).|\[\[.*?\]\])*?)'
-        rImage = re.compile(ur'(\[\[)(?P<namespace>%s)%s(?P<params>%s*?)(\]\])' % (r_namespace, r_image, r_param))
+        paramPattern = r'(?:\|(?:(?!\[\[).|\[\[.*?\]\])*?)'
+        rImage = re.compile(ur'\[\[(?P<namespace>%s)(?P<filename>%s)(?P<params>%s*?)\]\]' % (namespacePattern, imagePattern, paramPattern))
+        if replacement == None:
+            new_text = rImage.sub('', new_text)
+        else:
+            new_text = rImage.sub('[[\g<namespace>%s\g<params>]]' % replacement, new_text)
 
-        while True:
-            m = rImage.search(new_text)
-            if not m:
-                break
-            new_text = new_text[:m.start()] + simple_replacer(m, 2) + new_text[m.end():]
-
-        # Remove the image from galleries
-        r_galleries = ur'(?s)(\<%s\>)(?s)(.*?)(\<\/%s\>)' % (create_regex_i('gallery'),
-                                                            create_regex_i('gallery'))
-        r_gallery_item = ur'(?m)^((?:%s)?)%s(\s*(?:\|.*?)?\s*)$' % (r_namespace, r_image)
+        galleryR = re.compile(r'(?is)<gallery>(?P<items>.*?)</gallery>')
+        galleryItemR = re.compile(r'(?m)^%s?(?P<filename>%s)\s*(?P<label>\|.*?)?\s*$' % (namespacePattern, imagePattern))
         def gallery_replacer(match):
-            return ur'%s%s%s' % (match.group(1),
-                                 re.sub(r_gallery_item, simple_replacer, match.group(2)),
-                                 match.group(3))
-        new_text = re.sub(r_galleries, gallery_replacer, new_text)
+            return ur'<gallery>%s<gallery>' % galleryItemR.sub(filename_replacer, match.group('items'))
+        new_text = galleryR.sub(gallery_replacer, new_text)
 
         if (text == new_text) or (not safe):
             # All previous steps did not work, so the image is
             # likely embedded in a complicated template.
-            r_templates = ur'(?s)(\{\{.*?\}\})'
-            r_complicated = u'(?s)((?:%s)?)%s' % (r_namespace, r_image)
+            # Note: this regular expression can't handle nested templates.
+            templateR = re.compile(ur'(?s)\{\{(?<contents>.*?\}\}')
+            fileReferenceR = re.compile(u'%s(?P<filename>(?:%s)?)' % (namespacePattern, imagePattern))
 
             def template_replacer(match):
-                return re.sub(r_complicated, simple_replacer, match.group(1))
-            new_text = re.sub(r_templates, template_replacer, new_text)
+                return fileReferenceR.sub(filename_replacer, match.group('contents'))
 
+            new_text = templateR.sub(template_replacer, new_text)
 
         if put:
             if text != new_text:
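
The named-group approach in this revision can be tried outside the framework. The following is a rough Python 3 sketch, not the committed code: the function names, the fixed `namespaces` tuple (standing in for the `site.namespace(...)` lookups) and the `nocapitalize` argument are assumptions made for illustration. It also escapes each character individually and builds the `[ _]` alternation by splitting the file name, which differs from the escape-then-replace order used in the commit.

```python
import re


def case_insensitive_pattern(s):
    """Pattern that matches `s` with every letter case-insensitive."""
    parts = []
    for c in s:
        if c.lower() != c.upper():
            parts.append("[%s%s]" % (c.upper(), c.lower()))
        else:
            parts.append(re.escape(c))
    return "(?:%s)" % "".join(parts)


def capitalization_pattern(s, nocapitalize=False):
    """Pattern that matches `s`, first character case-insensitive
    unless the (hypothetical) `nocapitalize` flag is set."""
    if nocapitalize or not s:
        return re.escape(s)
    first, rest = s[0], s[1:]
    return "(?:[%s%s]%s)" % (
        re.escape(first.upper()), re.escape(first.lower()), re.escape(rest))


def filename_pattern(image, nocapitalize=False):
    """Treat spaces and underscores in the file name as interchangeable."""
    pieces = re.split("[ _]", image)
    head = capitalization_pattern(pieces[0], nocapitalize)
    tail = [re.escape(p) for p in pieces[1:]]
    return "[ _]".join([head] + tail)


def replace_image(text, image, replacement,
                  namespaces=("Image", "Media"), nocapitalize=False):
    """Replace (or, if `replacement` is None, remove) [[Image:...]] links."""
    namespace_pattern = r"\s*(?:%s)\s*:\s*" % "|".join(
        case_insensitive_pattern(n) for n in namespaces)
    # Parameters such as thumb, 200px and the caption; nested links must be closed.
    param_pattern = r"(?:\|(?:(?!\[\[).|\[\[.*?\]\])*?)"
    image_re = re.compile(
        r"\[\[(?P<namespace>%s)(?P<filename>%s)(?P<params>%s*?)\]\]"
        % (namespace_pattern, filename_pattern(image, nocapitalize), param_pattern))
    if replacement is None:
        return image_re.sub("", text)
    # A function is used so backslashes in `replacement` are taken literally.
    return image_re.sub(
        lambda m: "[[%s%s%s]]" % (m.group("namespace"), replacement, m.group("params")),
        text)


if __name__ == "__main__":
    sample = "See [[image:old_pic.jpg|thumb|A [[link]] here]] end"
    print(replace_image(sample, "Old pic.jpg", "New.png"))
```

Because the namespace and the parameters are captured by name, the replacement can be assembled from those groups directly, which is what lets the commit drop the manual search loop.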

[Pywikipedia-l] SVN: [4253] trunk/pywikipedia/weblinkchecker.py
by wikipedian＠svn.wikimedia.org 12 Sep '07

Revision: 4253
Author:   wikipedian
Date:     2007-09-12 10:54:24 +0000 (Wed, 12 Sep 2007)

Log Message:
-----------
completed docu

Modified Paths:
--------------
    trunk/pywikipedia/weblinkchecker.py

Modified: trunk/pywikipedia/weblinkchecker.py
===================================================================
--- trunk/pywikipedia/weblinkchecker.py  2007-09-12 10:43:37 UTC (rev 4252)
+++ trunk/pywikipedia/weblinkchecker.py  2007-09-12 10:54:24 UTC (rev 4253)
@@ -13,8 +13,8 @@
 two times, with a time lag of at least one week. Such links will be logged to a
 .txt file in the deadlinks subdirectory.
 
-After running the bot and waiting for at least one weak, you can re-check those
-pages where dead links where found, using the -repeat parameter.
+After running the bot and waiting for at least one week, you can re-check those
+pages where dead links were found, using the -repeat parameter.
 
 In addition to the logging step, it is possible to automatically report dead
 links to the talk page of the article where the link was found. To use this
@@ -24,13 +24,44 @@
 
 When a link is found alive, it will be removed from the .dat file.
 
-The following parameters are supported:
+These command line parameters can be used to specify which pages to work on:
 
 &params;
 
+-repeat      Work on all pages were dead links were found before. This is
+             useful to confirm that the links are dead after some time (at
+             least one week), which is required before the script will report
+             the problem.
+
+-namespace   Only process templates in the namespace with the given number or
+             name. This parameter may be used multiple times.
+
+Furthermore, the following command line parameters are supported:
+
+-talk        Overrides the report_dead_links_on_talk config variable, enabling
+             the feature.
+
+-notalk      Overrides the report_dead_links_on_talk config variable, disabling
+             the feature.
+
 All other parameters will be regarded as part of the title of a single page,
 and the bot will only work on that single page.
 
+The following config variables are supported:
+
+max_external_links        - The maximum number of web pages that should be
+                            loaded simultaneously. You should change this
+                            according to your Internet connection speed.
+                            Be careful: if it is set too high, the script
+                            might get socket errors because your network
+                            is congested, and will then think that the page
+                            is offline.
+
+report_dead_links_on_talk - If set to true, causes the script to report dead
+                            links on the article's talk page if (and ONLY if)
+                            the linked page has been unavailable at least two
+                            times during a timespan of at least one week.
+
 Syntax examples:
     python weblinkchecker.py
         Loads all wiki pages in alphabetical order using the Special:Allpages
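
The reporting rule described in the help text (unavailable at least two times during a timespan of at least one week) could be expressed roughly as follows. This is only one reading of the documentation, written in Python 3; `should_report` and its arguments are hypothetical, and the script's real bookkeeping in the .dat file is not shown in this message.

```python
from datetime import datetime, timedelta


def should_report(failure_times, min_span=timedelta(weeks=1)):
    """Return True if a link was seen dead at least twice and the first
    and last failed checks are at least `min_span` apart.

    `failure_times` is a hypothetical list of datetimes at which the
    link was found unavailable.
    """
    if len(failure_times) < 2:
        return False
    return max(failure_times) - min(failure_times) >= min_span


if __name__ == "__main__":
    first = datetime(2007, 9, 1, 12, 0)
    print(should_report([first]))
    print(should_report([first, first + timedelta(days=2)]))
    print(should_report([first, first + timedelta(days=8)]))
```

Requiring both conditions is what makes the -repeat run after a week necessary: a single failed check, or two failures on the same day, would not be enough.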

[Pywikipedia-l] SVN: [4252] trunk/pywikipedia/replace.py
by wikipedian＠svn.wikimedia.org 12 Sep '07

Revision: 4252
Author:   wikipedian
Date:     2007-09-12 10:43:37 +0000 (Wed, 12 Sep 2007)

Log Message:
-----------
adapted docu indentation, as the inline pagegenerators help was changed.

Modified Paths:
--------------
    trunk/pywikipedia/replace.py

Modified: trunk/pywikipedia/replace.py
===================================================================
--- trunk/pywikipedia/replace.py  2007-09-12 10:36:03 UTC (rev 4251)
+++ trunk/pywikipedia/replace.py  2007-09-12 10:43:37 UTC (rev 4252)
@@ -8,54 +8,54 @@
 
 &params;
 
-    -xml            Retrieve information from a local XML dump (pages-articles
-                    or pages-meta-current, see http://download.wikimedia.org).
-                    Argument can also be given as "-xml:filename".
+-xml              Retrieve information from a local XML dump (pages-articles
+                  or pages-meta-current, see http://download.wikimedia.org).
+                  Argument can also be given as "-xml:filename".
 
-    -page           Only edit a specific page.
-                    Argument can also be given as "-page:pagetitle". You can
-                    give this parameter multiple times to edit multiple pages.
+-page             Only edit a specific page.
+                  Argument can also be given as "-page:pagetitle". You can
+                  give this parameter multiple times to edit multiple pages.
 
 Furthermore, the following command line parameters are supported:
 
-    -regex          Make replacements using regular expressions. If this argument
-                    isn't given, the bot will make simple text replacements.
+-regex            Make replacements using regular expressions. If this argument
+                  isn't given, the bot will make simple text replacements.
 
-    -nocase         Use case insensitive regular expressions.
+-nocase           Use case insensitive regular expressions.
 
-    -except:XYZ     Ignore pages which contain XYZ. If the -regex argument is
-                    given, XYZ will be regarded as a regular expression.
+-except:XYZ       Ignore pages which contain XYZ. If the -regex argument is
+                  given, XYZ will be regarded as a regular expression.
 
-    -summary:XYZ    Set the summary message text for the edit to XYZ, bypassing
-                    the predefined message texts with original and replacements
-                    inserted.
+-summary:XYZ      Set the summary message text for the edit to XYZ, bypassing
+                  the predefined message texts with original and replacements
+                  inserted.
 
-    -fix:XYZ        Perform one of the predefined replacements tasks, which are
-                    given in the dictionary 'fixes' defined inside the file
-                    fixes.py.
-                    The -regex and -nocase argument and given replacements will
-                    be ignored if you use -fix.
-                    Currently available predefined fixes are:
+-fix:XYZ          Perform one of the predefined replacements tasks, which are
+                  given in the dictionary 'fixes' defined inside the file
+                  fixes.py.
+                  The -regex and -nocase argument and given replacements will
+                  be ignored if you use -fix.
+                  Currently available predefined fixes are:
 &fixes-help;
 
-    -namespace:n    Number of namespace to process. The parameter can be used
-                    multiple times. It works in combination with all other
-                    parameters, except for the -start parameter. If you e.g.
-                    want to iterate over all categories starting at M, use
-                    -start:Category:M.
+-namespace:n      Number or name of namespace to process. The parameter can be
+                  used multiple times. It works in combination with all other
+                  parameters, except for the -start parameter. If you e.g.
+                  want to iterate over all categories starting at M, use
+                  -start:Category:M.
 
-    -always         Don't prompt you for each replacement
+-always           Don't prompt you for each replacement
 
-    -recursive      Recurse replacement until possible. Be careful, this might
-                    lead to an infinite loop.
+-recursive        Recurse replacement as long as possible. Be careful, this
+                  might lead to an infinite loop.
 
-    -allowoverlap   When occurences of the pattern overlap, replace all of them.
-                    Be careful, this might lead to an infinite loop.
+-allowoverlap     When occurences of the pattern overlap, replace all of them.
+                  Be careful, this might lead to an infinite loop.
 
-    other:          First argument is the old text, second argument is the new text.
-                    If the -regex argument is given, the first argument will be
-                    regarded as a regular expression, and the second argument might
-                    contain expressions like \\1 or \g<name>.
+other:            First argument is the old text, second argument is the new text.
+                  If the -regex argument is given, the first argument will be
+                  regarded as a regular expression, and the second argument might
+                  contain expressions like \\1 or \g<name>.
 
 Examples:
 
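
The difference between a plain replacement and a -regex replacement with group references such as \g<name> can be illustrated with Python's `re` module directly. The sketch below is Python 3 and is not taken from replace.py; `replace_text` and its keyword arguments are hypothetical names that only mirror the -regex and -nocase options, and applying `nocase` to plain-text replacements as well is an assumption.

```python
import re


def replace_text(text, old, new, regex=False, nocase=False):
    """Replace `old` with `new` in `text`.

    With regex=True, `old` is a regular expression and `new` may contain
    group references such as \\1 or \\g<name>. Otherwise both arguments
    are taken literally.
    """
    flags = re.IGNORECASE if nocase else 0
    if regex:
        return re.sub(old, new, text, flags=flags)
    # Escape the search text and use a function so `new` is inserted verbatim.
    return re.sub(re.escape(old), lambda m: new, text, flags=flags)


if __name__ == "__main__":
    # Reorder a date using named groups.
    print(replace_text("2007-09-12",
                       r"(?P<y>\d{4})-(?P<m>\d\d)-(?P<d>\d\d)",
                       r"\g<d>.\g<m>.\g<y>",
                       regex=True))
    # Plain text: the dot is not a wildcard here.
    print(replace_text("a.b axb", "a.b", "X"))
```

On a shell command line the backslashes usually need extra quoting, which is why the help text writes \\1.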

[Pywikipedia-l] SVN: [4251] trunk/pywikipedia
by wikipedian＠svn.wikimedia.org 12 Sep '07

Revision: 4251 Author: wikipedian Date: 2007-09-12 10:36:03 +0000 (Wed, 12 Sep 2007) Log Message: ----------- usability: scripts now not only allow -namespace:4, but also e.g. -namespace:Wikip?\195?\169dia on fr:, -namespace:Wikipedia on all Wikipedias, and -namespace:Project on all wikis. Modified Paths: -------------- trunk/pywikipedia/capitalize_redirects.py trunk/pywikipedia/copyright.py trunk/pywikipedia/family.py trunk/pywikipedia/noreferences.py trunk/pywikipedia/pagegenerators.py trunk/pywikipedia/redirect.py trunk/pywikipedia/refcheck.py trunk/pywikipedia/replace.py trunk/pywikipedia/selflink.py trunk/pywikipedia/standardize_notes.py trunk/pywikipedia/template.py trunk/pywikipedia/templatecount.py trunk/pywikipedia/unlink.py trunk/pywikipedia/weblinkchecker.py Modified: trunk/pywikipedia/capitalize_redirects.py =================================================================== --- trunk/pywikipedia/capitalize_redirects.py 2007-09-12 10:07:51 UTC (rev 4250) +++ trunk/pywikipedia/capitalize_redirects.py 2007-09-12 10:36:03 UTC (rev 4251) @@ -19,12 +19,13 @@ -start Work on all pages on the home wiki, starting at the named page. - + -page Work on a single page. -namespace Run over especific namespace. - Argument can also be given as "-namespace:100". - + Argument can also be given as "-namespace:100" or + "-namespace:Image". + -always Don't prompt to make changes, just do them. 
Example: "python capitalize_redirects.py -start:B -always" @@ -129,7 +130,10 @@ elif arg == '-always': acceptall = True elif arg.startswith('-namespace:'): - namespaces.append(int(arg[11:])) + try: + namespaces.append(int(arg[11:])) + except ValueError: + namespaces.append(arg[11:]) else: commandline_replacements.append(arg) Modified: trunk/pywikipedia/copyright.py =================================================================== --- trunk/pywikipedia/copyright.py 2007-09-12 10:07:51 UTC (rev 4250) +++ trunk/pywikipedia/copyright.py 2007-09-12 10:36:03 UTC (rev 4251) @@ -47,7 +47,7 @@ -links - Work on all pages that are linked to from a certain page. Argument can also be given as "-links:linkingpagetitle". -start - Work on all pages in the wiki, starting at a given page. --namespace:n - Number of namespace to process. The parameter can be used +-namespace:n - Number or name of namespace to process. The parameter can be used multiple times. Examples: @@ -961,7 +961,10 @@ else: PageTitles.append(arg[6:]) elif arg.startswith('-namespace:'): - namespaces.append(int(arg[11:])) + try: + namespaces.append(int(arg[11:])) + except ValueError: + namespaces.append(arg[11:]) elif arg.startswith('-forceupdate'): load_pages(force_update = True) elif arg == '-repeat': Modified: trunk/pywikipedia/family.py =================================================================== --- trunk/pywikipedia/family.py 2007-09-12 10:07:51 UTC (rev 4250) +++ trunk/pywikipedia/family.py 2007-09-12 10:36:03 UTC (rev 4251) @@ -2239,18 +2239,18 @@ v = self.namespaces[ns_number][fallback] else: raise KeyError('ERROR: title for namespace %d in language %s unknown' % (ns_number, code)) - - if all: - if type(v) == type([]): - return tuple(v) - else: - return (v, ) - else: - if type(v) == type([]): - return v[0] - else: - return v + if all: + if type(v) == type([]): + return tuple(v) + else: + return (v, ) + else: + if type(v) == type([]): + return v[0] + else: + return v + def isDefinedNS(self, 
ns_number): """Return True if the namespace has been defined in this family. """ Modified: trunk/pywikipedia/noreferences.py =================================================================== --- trunk/pywikipedia/noreferences.py 2007-09-12 10:07:51 UTC (rev 4250) +++ trunk/pywikipedia/noreferences.py 2007-09-12 10:36:03 UTC (rev 4251) @@ -14,8 +14,8 @@ or pages-meta-current, see http://download.wikimedia.org). Argument can also be given as "-xml:filename". - -namespace:n Number of namespace to process. The parameter can be used - multiple times. It works in combination with all other + -namespace:n Number or name of namespace to process. The parameter can be + used multiple times. It works in combination with all other parameters, except for the -start parameter. If you e.g. want to iterate over all categories starting at M, use -start:Category:M. @@ -306,7 +306,10 @@ xmlFilename = arg[5:] gen = XmlDumpNoReferencesPageGenerator(xmlFilename) elif arg.startswith('-namespace:'): - namespaces.append(int(arg[11:])) + try: + namespaces.append(int(arg[11:])) + except ValueError: + namespaces.append(arg[11:]) elif arg == '-always': always = True else: Modified: trunk/pywikipedia/pagegenerators.py =================================================================== --- trunk/pywikipedia/pagegenerators.py 2007-09-12 10:07:51 UTC (rev 4250) +++ trunk/pywikipedia/pagegenerators.py 2007-09-12 10:36:03 UTC (rev 4251) @@ -417,9 +417,20 @@ def NamespaceFilterPageGenerator(generator, namespaces): """ - Wraps around another generator. Yields only those pages that are in a list - of specific namespace. + Wraps around another generator. Yields only those pages that are in one + of the given namespaces. + + The namespace list can contain both integers (namespace numbers) and + strings/unicode strings (namespace names). 
""" + # convert namespace names to namespace numbers + for i in xrange(len(namespaces)): + ns = namespaces[i] + if isinstance(ns, unicode) or isinstance(ns, str): + index = wikipedia.getSite().getNamespaceIndex(ns) + if index is None: + raise ValueError(u'Unknown namespace: %s' % ns) + namespaces[i] = index for page in generator: if page.namespace() in namespaces: yield page Modified: trunk/pywikipedia/redirect.py =================================================================== --- trunk/pywikipedia/redirect.py 2007-09-12 10:07:51 UTC (rev 4250) +++ trunk/pywikipedia/redirect.py 2007-09-12 10:36:03 UTC (rev 4251) @@ -292,7 +292,10 @@ else: xmlFilename = arg[5:] elif arg.startswith('-namespace:'): - namespace = int(arg[11:]) + try: + namespaces.append(int(arg[11:])) + except ValueError: + namespaces.append(arg[11:]) elif arg.startswith('-restart:'): restart = int(arg[9:]) else: Modified: trunk/pywikipedia/refcheck.py =================================================================== --- trunk/pywikipedia/refcheck.py 2007-09-12 10:07:51 UTC (rev 4250) +++ trunk/pywikipedia/refcheck.py 2007-09-12 10:36:03 UTC (rev 4251) @@ -1,7 +1,11 @@ -""" +""" This script checks references to see if they are properly formatted. Right now it just counts the total number of transclusions of any number of given templates. +NOTE: This script is not capable of handling the <ref></ref> syntax. It just +handles the {{ref}} syntax, which is still used, but DEPRECATED on the English +Wikipedia. 
+ Syntax: python refcheck.py command [arguments] Command line options: @@ -50,7 +54,10 @@ if arg == '-count': doCount = True elif arg.startswith('-namespace:'): - namespaces.append(int(arg[len('-namespace:'):])) + try: + namespaces.append(int(arg[len('-namespace:'):])) + except ValueError: + namespaces.append(arg[len('-namespace:'):]) else: argsList.append(arg) Modified: trunk/pywikipedia/replace.py =================================================================== --- trunk/pywikipedia/replace.py 2007-09-12 10:07:51 UTC (rev 4250) +++ trunk/pywikipedia/replace.py 2007-09-12 10:36:03 UTC (rev 4251) @@ -351,7 +351,10 @@ elif arg == '-nocase': caseInsensitive = True elif arg.startswith('-namespace:'): - namespaces.append(int(arg[11:])) + try: + namespaces.append(int(arg[11:])) + except ValueError: + namespaces.append(arg[11:]) elif arg.startswith('-summary:'): wikipedia.setAction(arg[9:]) summary_commandline = True Modified: trunk/pywikipedia/selflink.py =================================================================== --- trunk/pywikipedia/selflink.py 2007-09-12 10:07:51 UTC (rev 4250) +++ trunk/pywikipedia/selflink.py 2007-09-12 10:36:03 UTC (rev 4251) @@ -212,7 +212,10 @@ LIMIT 100""" gen = pagegenerators.MySQLPageGenerator(query) elif arg.startswith('-namespace:'): - namespaces.append(int(arg[11:])) + try: + namespaces.append(int(arg[11:])) + except ValueError: + namespaces.append(arg[11:]) else: generator = genFactory.handleArg(arg) if generator: Modified: trunk/pywikipedia/standardize_notes.py =================================================================== --- trunk/pywikipedia/standardize_notes.py 2007-09-12 10:07:51 UTC (rev 4250) +++ trunk/pywikipedia/standardize_notes.py 2007-09-12 10:36:03 UTC (rev 4251) @@ -6,6 +6,10 @@ At present it converts to [[Wikipedia:Footnote3]] format (ref/note). +NOTE: This script is not capable of handling the <ref></ref> syntax. 
It just +handles the {{ref}} syntax, which is still used, but DEPRECATED on the English +Wikipedia. + You can run the bot with the following commandline parameters: -file - Work on all pages given in a local text file. @@ -1062,7 +1066,10 @@ elif arg == '-always': acceptall = True elif arg.startswith('-namespace:'): - namespace = int(arg[11:]) + try: + namespaces.append(int(arg[11:])) + except ValueError: + namespaces.append(arg[11:]) else: commandline_replacements.append(arg) Modified: trunk/pywikipedia/template.py =================================================================== --- trunk/pywikipedia/template.py 2007-09-12 10:07:51 UTC (rev 4250) +++ trunk/pywikipedia/template.py 2007-09-12 10:36:03 UTC (rev 4251) @@ -333,7 +333,10 @@ else: xmlfilename = arg[5:] elif arg.startswith('-namespace:'): - namespaces.append(int(arg[len('-namespace:'):])) + try: + namespaces.append(int(arg[len('-namespace:'):])) + except ValueError: + namespaces.append(arg[len('-namespace:'):]) elif arg.startswith('-category:'): addedCat = arg[len('-category:'):] elif arg.startswith('-summary:'): Modified: trunk/pywikipedia/templatecount.py =================================================================== --- trunk/pywikipedia/templatecount.py 2007-09-12 10:07:51 UTC (rev 4250) +++ trunk/pywikipedia/templatecount.py 2007-09-12 10:36:03 UTC (rev 4251) @@ -1,4 +1,4 @@ -""" +""" This script will display the list of pages transcluding a given list of templates. It can also be used to simply count the number of pages (rather than listing each individually). 
@@ -33,75 +33,78 @@ import datetime class TemplateCountRobot: - #def __init__(self): - #Nothing - def countTemplates(self, templates, namespaces): - mysite = wikipedia.getSite() - finalText = [u'Number of transclusions per template',u'------------------------------------'] - total = 0 - for template in templates: - gen = pagegenerators.ReferringPageGenerator(wikipedia.Page(mysite, mysite.template_namespace() + ':' + template), onlyTemplateInclusion = True) - if namespaces: - gen = pagegenerators.NamespaceFilterPageGenerator(gen, namespaces) - count = 0 - for page in gen: - count = count + 1 - finalText.append(u'%s: %d' % (template, count)) - total = total + count - for line in finalText: - wikipedia.output(line, toStdout=True) - wikipedia.output(u'TOTAL: %d' % total, toStdout=True) - wikipedia.output(u'Report generated on %s' % datetime.datetime.utcnow().isoformat(), toStdout=True) + #def __init__(self): + #Nothing + def countTemplates(self, templates, namespaces): + mysite = wikipedia.getSite() + finalText = [u'Number of transclusions per template',u'------------------------------------'] + total = 0 + for template in templates: + gen = pagegenerators.ReferringPageGenerator(wikipedia.Page(mysite, mysite.template_namespace() + ':' + template), onlyTemplateInclusion = True) + if namespaces: + gen = pagegenerators.NamespaceFilterPageGenerator(gen, namespaces) + count = 0 + for page in gen: + count = count + 1 + finalText.append(u'%s: %d' % (template, count)) + total = total + count + for line in finalText: + wikipedia.output(line, toStdout=True) + wikipedia.output(u'TOTAL: %d' % total, toStdout=True) + wikipedia.output(u'Report generated on %s' % datetime.datetime.utcnow().isoformat(), toStdout=True) - def listTemplates(self, templates, namespaces): - mysite = wikipedia.getSite() - count = 0 - finalText = [u'List of pages transcluding templates:'] - for template in templates: - finalText.append(u'* %s' % template) - 
finalText.append(u'------------------------------------')
-        for template in templates:
-            gen = pagegenerators.ReferringPageGenerator(wikipedia.Page(mysite, mysite.template_namespace() + ':' + template), onlyTemplateInclusion = True)
-            if namespaces:
-                gen = pagegenerators.NamespaceFilterPageGenerator(gen, namespaces)
-            for page in gen:
-                finalText.append(u'%s' % page.title())
-                count = count + 1
-        finalText.append(u'Total page count: %d' % count)
-        for line in finalText:
-            wikipedia.output(line, toStdout=True)
-        wikipedia.output(u'Report generated on %s' % datetime.datetime.utcnow().isoformat(), toStdout=True)
+    def listTemplates(self, templates, namespaces):
+        mysite = wikipedia.getSite()
+        count = 0
+        finalText = [u'List of pages transcluding templates:']
+        for template in templates:
+            finalText.append(u'* %s' % template)
+        finalText.append(u'------------------------------------')
+        for template in templates:
+            gen = pagegenerators.ReferringPageGenerator(wikipedia.Page(mysite, mysite.template_namespace() + ':' + template), onlyTemplateInclusion = True)
+            if namespaces:
+                gen = pagegenerators.NamespaceFilterPageGenerator(gen, namespaces)
+            for page in gen:
+                finalText.append(u'%s' % page.title())
+                count = count + 1
+        finalText.append(u'Total page count: %d' % count)
+        for line in finalText:
+            wikipedia.output(line, toStdout=True)
+        wikipedia.output(u'Report generated on %s' % datetime.datetime.utcnow().isoformat(), toStdout=True)

 def main():
-    operation = "None"
-    doCount = False
-    doList = False
-    argsList = []
-    namespaces = []
+    operation = "None"
+    doCount = False
+    doList = False
+    argsList = []
+    namespaces = []

-    for arg in wikipedia.handleArgs():
-        if arg == '-count':
-            operation = "Count"
-        elif arg == '-list':
-            operation = "List"
-        elif arg.startswith('-namespace:'):
-            namespaces.append(int(arg[len('-namespace:'):]))
-        else:
-            argsList.append(arg)
+    for arg in wikipedia.handleArgs():
+        if arg == '-count':
+            operation = "Count"
+        elif arg == '-list':
+            operation = "List"
+        elif arg.startswith('-namespace:'):
+            try:
+                namespaces.append(int(arg[len('-namespace:'):]))
+            except ValueError:
+                namespaces.append(arg[len('-namespace:'):])
+        else:
+            argsList.append(arg)

-    if operation == "None":
-        wikipedia.output(__doc__, 'utf-8')
-    else:
-        robot = TemplateCountRobot()
-        if not argsList:
-            argsList = ['ref', 'note', 'ref label', 'note label']
-        if operation == "Count":
-            robot.countTemplates(argsList, namespaces)
-        elif operation == "List":
-            robot.listTemplates(argsList, namespaces)
+    if operation == "None":
+        wikipedia.output(__doc__, 'utf-8')
+    else:
+        robot = TemplateCountRobot()
+        if not argsList:
+            argsList = ['ref', 'note', 'ref label', 'note label']
+        if operation == "Count":
+            robot.countTemplates(argsList, namespaces)
+        elif operation == "List":
+            robot.listTemplates(argsList, namespaces)

 if __name__ == "__main__":
-    try:
-        main()
-    finally:
-        wikipedia.stopme()
+    try:
+        main()
+    finally:
+        wikipedia.stopme()

Modified: trunk/pywikipedia/unlink.py
===================================================================
--- trunk/pywikipedia/unlink.py 2007-09-12 10:07:51 UTC (rev 4250)
+++ trunk/pywikipedia/unlink.py 2007-09-12 10:36:03 UTC (rev 4251)
@@ -145,7 +145,10 @@

     for arg in wikipedia.handleArgs():
         if arg.startswith('-namespace:'):
-            namespaces.append(int(arg[11:]))
+            try:
+                namespaces.append(int(arg[11:]))
+            except ValueError:
+                namespaces.append(arg[11:])
         else:
             pageTitleParts.append(arg)

Modified: trunk/pywikipedia/weblinkchecker.py
===================================================================
--- trunk/pywikipedia/weblinkchecker.py 2007-09-12 10:07:51 UTC (rev 4250)
+++ trunk/pywikipedia/weblinkchecker.py 2007-09-12 10:36:03 UTC (rev 4251)
@@ -669,7 +669,10 @@
         elif arg == '-notalk':
             config.report_dead_links_on_talk = False
         elif arg.startswith('-namespace:'):
-            namespaces.append(int(arg[11:]))
+            try:
+                namespaces.append(int(arg[11:]))
+            except ValueError:
+                namespaces.append(arg[11:])
         elif arg == '-repeat':
             gen = RepeatPageGenerator()
         else:
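The recurring change in this commit is the handling of -namespace: arguments: the value is stored as an integer when it parses as one, and is otherwise kept as a string. A minimal standalone sketch of that pattern follows. The helper name parse_namespace_args and its prefix parameter are hypothetical and not part of pywikipedia; the scripts themselves inline this logic in their argument loops.

```python
def parse_namespace_args(args, prefix='-namespace:'):
    """Split args into namespace selectors and all remaining arguments.

    Hypothetical helper: numeric values become ints, anything else is
    kept as a string, mirroring the try/except ValueError in the diff.
    """
    namespaces = []
    remaining = []
    for arg in args:
        if arg.startswith(prefix):
            value = arg[len(prefix):]
            try:
                namespaces.append(int(value))
            except ValueError:
                # Not a number, so keep the raw text (e.g. a namespace name).
                namespaces.append(value)
        else:
            remaining.append(arg)
    return namespaces, remaining


if __name__ == '__main__':
    found, rest = parse_namespace_args(
        ['-namespace:10', '-namespace:Template', '-list', 'ref'])
    print(found, rest)
```

Using len(prefix) rather than the literal 11 keeps the slice correct if the option name ever changes, which is the form templatecount.py already uses.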

[Pywikipedia-l] SVN: [4250] trunk/pywikipedia/family.py
by wikipedian＠svn.wikimedia.org 12 Sep '07

Revision: 4250
Author:   wikipedian
Date:     2007-09-12 10:07:51 +0000 (Wed, 12 Sep 2007)

Log Message:
-----------
removed method that is no longer useful

Modified Paths:
--------------
    trunk/pywikipedia/family.py

Modified: trunk/pywikipedia/family.py
===================================================================
--- trunk/pywikipedia/family.py 2007-09-12 10:04:33 UTC (rev 4249)
+++ trunk/pywikipedia/family.py 2007-09-12 10:07:51 UTC (rev 4250)
@@ -2218,18 +2218,10 @@
         """Add a new language to the langs and namespaces of the family.
            This is supposed to be called in the constructor of the family."""
         self.langs[code] = location
-        
+
         for num, val in namespaces.items():
             self.namespaces[num][code]=val

-    def _talkNamespace(self, code, associatedNamespaceIndex):
-        associatedNamespace = self.namespace(code, associatedNamespaceIndex)
-        if self.talkNamespacePatterns.has_key(code):
-            talk = self.talkNamespacePatterns[code]
-        else:
-            talk = self.talkNamespacePatterns['_default']
-        return talk(associatedNamespace)
-
     def linktrail(self, code, fallback = '_default'):
         if self.linktrails.has_key(code):
             return self.linktrails[code]

[Pywikipedia-l] SVN: [4249] trunk/pywikipedia/template.py
by wikipedian＠svn.wikimedia.org 12 Sep '07

12 Sep '07

Revision: 4249
Author:   wikipedian
Date:     2007-09-12 10:04:33 +0000 (Wed, 12 Sep 2007)

Log Message:
-----------
bugfix

Modified Paths:
--------------
    trunk/pywikipedia/template.py

Modified: trunk/pywikipedia/template.py
===================================================================
--- trunk/pywikipedia/template.py 2007-09-12 10:00:35 UTC (rev 4248)
+++ trunk/pywikipedia/template.py 2007-09-12 10:04:33 UTC (rev 4249)
@@ -373,8 +373,6 @@
     else:
         gens = []
         gens = [pagegenerators.ReferringPageGenerator(t, onlyTemplateInclusion = True) for t in oldTemplates]
-        singleGen = 
-        gens.append(singleGen)
         gen = pagegenerators.CombinedPageGenerator(gens)
         gen = pagegenerators.DuplicateFilterPageGenerator(gen)
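
After this fix, the code builds one referring-page generator per template, chains them with CombinedPageGenerator, and passes the result through DuplicateFilterPageGenerator. The sketch below is a plain-Python analogue of that chain-then-deduplicate pattern, assuming the two pagegenerators helpers behave as their names suggest; their actual implementations are not shown in this thread. The functions combine and drop_duplicates are hypothetical stand-ins, and strings stand in for Page objects.

```python
from itertools import chain


def combine(generators):
    """Yield every item from each generator in turn (hypothetical stand-in)."""
    return chain.from_iterable(generators)


def drop_duplicates(items):
    """Yield each item once, keeping first-seen order (hypothetical stand-in)."""
    seen = set()
    for item in items:
        if item not in seen:
            seen.add(item)
            yield item


if __name__ == '__main__':
    # Two templates may be transcluded by the same page, hence the overlap.
    per_template = [iter(['Page A', 'Page B']), iter(['Page B', 'Page C'])]
    for title in drop_duplicates(combine(per_template)):
        print(title)
```

Deduplication matters here because a page that transcludes several of the searched templates would otherwise be processed once per template.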

[Pywikipedia-l] SVN: [4248] trunk/pywikipedia/template.py
by wikipedian＠svn.wikimedia.org 12 Sep '07

12 Sep '07

Revision: 4248
Author:   wikipedian
Date:     2007-09-12 10:00:35 +0000 (Wed, 12 Sep 2007)

Log Message:
-----------
made XmlDumpTemplatePageGenerator usable with multiple templates

Modified Paths:
--------------
    trunk/pywikipedia/template.py

Modified: trunk/pywikipedia/template.py
===================================================================
--- trunk/pywikipedia/template.py 2007-09-12 09:49:46 UTC (rev 4247)
+++ trunk/pywikipedia/template.py 2007-09-12 10:00:35 UTC (rev 4248)
@@ -103,14 +103,15 @@
     template. These pages will be retrieved from a local XML dump file
     (cur table).
     """
-    def __init__(self, template, xmlfilename):
+    def __init__(self, templates, xmlfilename):
         """
         Arguments:
-            * template - A Page object representing the searched template
-            * xmlfilename - The dump's path, either absolute or relative
+            * templateNames - A list of Page object representing the searched
+                              templates
+            * xmlfilename   - The dump's path, either absolute or relative
         """
-        self.template = template
-        wikipedia.Page(mysite, ns + ':' + thisPage)
+
+        self.templates = templates
         self.xmlfilename = xmlfilename

     def __iter__(self):
@@ -124,14 +125,15 @@
         # {{vfd}} does the same thing as {{Vfd}}, so both will be found.
         # The old syntax, {{msg:vfd}}, will also be found.
         # TODO: check site.nocapitalize()
-        templateName = self.template.titleWithoutNamespace()
-        if wikipedia.getSite().nocapitalize:
-            # FIXME
-            old = self.old
-        else:
-            templateName = '[' + templateName[0].upper() + templateName[0].lower() + ']' + templateName[1:]
-        templateName = re.sub(' ', '[_ ]', templateName)
-        templateRegex = re.compile(r'\{\{ *([mM][sS][gG]:)?' + templateName + ' *(?P<parameters>\|[^}]+|) *}}')
+        templatePatterns = []
+        for template in self.templates:
+            templatePattern = template.titleWithoutNamespace()
+            if not wikipedia.getSite().nocapitalize:
+                templatePattern = '[' + templatePattern[0].upper() + templatePattern[0].lower() + ']' + templatePattern[1:]
+            templatePattern = re.sub(' ', '[_ ]', templatePattern)
+            templatePatterns.append(templatePattern)
+        templateRegex = re.compile(r'\{\{ *([mM][sS][gG]:)?(?:%s) *(?P<parameters>\|[^}]+|) *}}' % '|'.join(templatePatterns))
+
         for entry in dump.parse():
             if templateRegex.search(entry.text):
                 page = wikipedia.Page(mysite, entry.title)
@@ -357,17 +359,21 @@
             wikipedia.output(u'Unless using -subst or -remove, you must give an even number of template names.')
             return

+    oldTemplates = []
+    ns = wikipedia.getSite().template_namespace()
+    for templateName in templates.keys():
+        oldTemplate = wikipedia.Page(wikipedia.getSite(), ns + ':' + templateName)
+        oldTemplates.append(oldTemplate)
+
     if xmlfilename:
-        gen = XmlDumpTemplatePageGenerator(templates.keys(), xmlfilename)
+        gen = XmlDumpTemplatePageGenerator(oldTemplates, xmlfilename)
     elif pageTitles:
         pages = [wikipedia.Page(wikipedia.getSite(), pageTitle) for pageTitle in pageTitles]
         gen = iter(pages)
     else:
         gens = []
-        ns = wikipedia.getSite().template_namespace()
-        for templateName in templates.keys():
-            template = wikipedia.Page(wikipedia.getSite(), ns + ':' + templateName)
-            singleGen = pagegenerators.ReferringPageGenerator(template, onlyTemplateInclusion = True)
+        gens = [pagegenerators.ReferringPageGenerator(t, onlyTemplateInclusion = True) for t in oldTemplates]
+        singleGen = 
         gens.append(singleGen)
         gen = pagegenerators.CombinedPageGenerator(gens)
         gen = pagegenerators.DuplicateFilterPageGenerator(gen)
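
The core of this revision is the combined regular expression: each template name becomes a sub-pattern (first letter in either case, spaces matching a space or an underscore), and the sub-patterns are joined with | inside a single non-capturing group. The following standalone sketch shows that construction. The function build_template_regex and its capitalize_first flag are hypothetical names, with the flag standing in for "not site.nocapitalize". Unlike the committed code, the sketch also passes the names through re.escape, which is an added precaution rather than something the diff does.

```python
import re


def build_template_regex(names, capitalize_first=True):
    """Compile one regex matching transclusions of any of the given names.

    Hypothetical helper modelled on the diff above; `names` are plain
    template titles without the namespace prefix.
    """
    patterns = []
    for name in names:
        first, rest = name[0], name[1:]
        if capitalize_first:
            # {{vfd}} and {{Vfd}} refer to the same template.
            head = '[%s%s]' % (re.escape(first.upper()), re.escape(first.lower()))
        else:
            head = re.escape(first)
        # A space in a title may be written as a space or an underscore.
        tail = '[_ ]'.join(re.escape(part) for part in rest.split(' '))
        patterns.append(head + tail)
    return re.compile(
        r'\{\{ *([mM][sS][gG]:)?(?:%s) *(?P<parameters>\|[^}]+|) *}}'
        % '|'.join(patterns))


if __name__ == '__main__':
    regex = build_template_regex(['vfd', 'ref label'])
    for text in ['{{Vfd}}', '{{msg:vfd|reason=x}}', '{{ref_label|a}}', '{{other}}']:
        print(text, bool(regex.search(text)))
```

Compiling one alternation means each dump entry is scanned once, however many templates are being searched for.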
