pywikibot November 2007

pywikibot@lists.wikimedia.org

29 participants
285 discussions

[Pywikipedia-l] [ pywikipediabot-Patches-1828962 ] interwiki.py - Uzbek language summary translation
by SourceForge.net 09 Nov '07

09 Nov '07

Patches item #1828962, was opened at 2007-11-09 13:30 Message generated for change (Comment added) made by filnik You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=603140&aid=1828962&group_… Please note that this message will contain a full copy of the comment thread, including the initial issue submission, for this request, not just the latest update. Category: None Group: None Status: Open Resolution: None Priority: 5 Private: No Submitted By: Alleborgo (alleborgo) Assigned to: Nobody/Anonymous (nobody) Summary: interwiki.py - Uzbek language summary translation Initial Comment: patch to add edit summary translations for Uzbek language. ---------------------------------------------------------------------- Comment By: Filnik (filnik) Date: 2007-11-09 16:20 Message: Logged In: YES user_id=1834469 Originator: NO Patch committed. ---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=603140&aid=1828962&group_…

1 0

[Pywikipedia-l] SVN: [4522] trunk/pywikipedia/interwiki.py
by filnik＠svn.wikimedia.org 09 Nov '07

09 Nov '07

Revision: 4522 Author: filnik Date: 2007-11-09 16:18:01 +0000 (Fri, 09 Nov 2007) Log Message: ----------- Adding Uzbek language summary translation, Patch 1828962 committed by Alleborgo Modified Paths: -------------- trunk/pywikipedia/interwiki.py Modified: trunk/pywikipedia/interwiki.py =================================================================== --- trunk/pywikipedia/interwiki.py 2007-11-09 16:09:36 UTC (rev 4521) +++ trunk/pywikipedia/interwiki.py 2007-11-09 16:18:01 UTC (rev 4522) @@ -333,6 +333,7 @@ 'to': (u'mīsini', u'ʻoku tānaki', u'ʻoku toʻo', u'ʻoku liliu'), 'tr': (u'Bot değişikliği', u'Ekleniyor', u'Kaldırılıyor', u'Değiştiriliyor'), 'uk': (u'робот', u'додав', u'видалив', u'змінив'), + 'uz': (u'Bot', u'Qoʻshdi', u'Tuzatdi', u'Oʻchirdi'), 'vi': (u'robot ', u'Thêm', u'Dời', u'Thay'), 'vo': (u'bot ', u'läükon', u'moükon', u'votükon'), 'yue': (u'機械人 ', u'加', u'減', u'改'),

1 0

[Pywikipedia-l] SVN: [4521] trunk/pywikipedia/pageimport.py
by filnik＠svn.wikimedia.org 09 Nov '07

09 Nov '07

Revision: 4521 Author: filnik Date: 2007-11-09 16:09:36 +0000 (Fri, 09 Nov 2007) Log Message: ----------- Some fixes to make it work better (tested, working perfectly) Modified Paths: -------------- trunk/pywikipedia/pageimport.py Modified: trunk/pywikipedia/pageimport.py =================================================================== --- trunk/pywikipedia/pageimport.py 2007-11-09 15:28:31 UTC (rev 4520) +++ trunk/pywikipedia/pageimport.py 2007-11-09 16:09:36 UTC (rev 4521) @@ -4,6 +4,20 @@ This is a script to import pages from a certain wiki to another. This requires administrator privileges. + +Here there is an example of how to use it: + +from pageimport import * +def main(): + # Defing what page to load.. + pagetoload = 'Apple' + site = wikipedia.getSite() + importerbot = Importer(site) # Inizializing + importerbot.Import(pagetoload, prompt = True) +try: + main() +finally: + wikipedia.stopme() """ # # (C) Filnik, 2007 @@ -16,46 +30,35 @@ __version__ = '$Id: pageimport.py 4336 2007-09-20 14:51:04Z wikipedian $' -import wikipedia - -# Global variables -site = wikipedia.getSite() -# ################################################################ # -def main(): - # Defing what page to load.. - pagetoload = 'Apple' - importerbot = Importer(site) # Inizializing - importerbot.Import(pagetoload, prompt = True) - -""" -**************************************************************************** -System functions follow: no changes should be necessary! -**************************************************************************** -""" - import urllib -import login, config +import wikipedia, login, config class Importer(wikipedia.Page): def __init__(self, site): + self.importsite = site wikipedia.Page.__init__(self, site, 'Special:Import', None, 0) def Import(self, target, project = 'w', crono = '1', namespace = '', prompt = True): """Import the page from the wiki. Requires administrator status. If prompt is True, asks the user if he wants to delete the page. """ + if project == 'w': + site = wikipedia.getSite(fam = 'wikipedia') + elif project == 'b': + site = wikipedia.getSite(fam = 'wikibooks') + elif project == 'wikt': + site = wikipedia.getSite(fam = 'wiktionary') + elif project == 's': + site = wikipedia.getSite(fam = 'wikisource') + elif project == 'q': + site = wikipedia.getSite(fam = 'wikiquote') + else: + site = wikipedia.getSite() # Fixing the crono value... if crono == True: crono = '1' elif crono == False: crono = '0' - elif crono == '0': - pass - elif crono == '1': - pass - else: - wikipedia.output(u'Crono value set wrongly.') - wikipedia.stopme() # Fixing namespace's value. if namespace == '0': namespace == '' @@ -92,14 +95,12 @@ response, data = self.site().postForm(address, predata, sysop = True) if data: wikipedia.output(u'Page imported, checking...') - if wikipedia.Page(site, target).exists(): + if wikipedia.Page(self.importsite, target).exists(): wikipedia.output(u'Import success!') return True else: wikipedia.output(u'Import failed!') return False if __name__=='__main__': - try: - main() - finally: - wikipedia.stopme() \ No newline at end of file + wikipedia.output(u'This is just a module! Read the documentation and write your own script!') + wikipedia.stopme()

1 0

[Pywikipedia-l] SVN: [4520] trunk/pywikipedia/welcome.py
by filnik＠svn.wikimedia.org 09 Nov '07

09 Nov '07

Revision: 4520 Author: filnik Date: 2007-11-09 15:28:31 +0000 (Fri, 09 Nov 2007) Log Message: ----------- Some fixes, deleting the project-checker so the bot will use en config if no one is specified Modified Paths: -------------- trunk/pywikipedia/welcome.py Modified: trunk/pywikipedia/welcome.py =================================================================== --- trunk/pywikipedia/welcome.py 2007-11-09 11:40:26 UTC (rev 4519) +++ trunk/pywikipedia/welcome.py 2007-11-09 15:28:31 UTC (rev 4520) @@ -20,6 +20,7 @@ * Italian Wikipedia: http://it.wikipedia.org/wiki/Wikipedia:Benvenuto_log * English Wikiquote: http://en.wikiquote.org/wiki/Wikiquote:Welcome_log * Persian Wikipedia: http://fa.wikipedia.org/wiki/ویکی‌پدیا:سیاهه خوشامد +* Korean Wikipedia: http://ko.wikipedia.org/wiki/위키백과:Welcome log Everything that needs customisation to support additional projects is indicated by comments. @@ -138,9 +139,7 @@ use them. *************************** Known issues/FIXMEs **************************** -* use default pages if a wiki is not configured, so no configuration of - the script would be required at all. Suggestion: use English language - defaults. + * The regex to load the user might be slightly different from project to project. (in this case, write to Filnik for help...) * Understand if it's the case to use a class to group toghether the functions used. @@ -300,7 +299,7 @@ 'fa': u'Project:سیاهه خوشامد/امضاها', 'en': u'User:Filnik/Sign', 'it': u'Project:Benvenuto log/User', - 'zh': u'user:Welcomebot/欢迎日志/用户', + 'zh': u'user:Welcomebot/欢迎日志/用户', } # The page where the bot reads the real-time whitelist page. # (this parameter is optional). @@ -309,8 +308,6 @@ 'en':u'User:Filnik/whitelist', 'it':u'Utente:Filbot/whitelist', } -# Add your project (in alphabetical order) if you want that the bot start. -project_inserted = ['ar', 'commons', 'de', 'en', 'fa', 'it', 'nl', 'no', 'sq','zh'] # Ok, that's all. What is below, is the rest of code, now the code is fixed # and it will run correctly in your project ;) @@ -321,9 +318,6 @@ class FilenameNotSet(wikipedia.Error): """An exception indicating that a signature filename was not specifed.""" -class NoProjectFound(wikipedia.Error): - """An exception indicating that the Bot can't find the Project's parameters.""" - # Function stolen from wikipedia.py and modified. def urlname(talk_page, site): """The name of the page this Page refers to, in a form suitable for the URL of the page.""" @@ -597,10 +591,6 @@ # The site wsite = wikipedia.getSite() - # A little block-statement to ensure that the bot won't start with en-parameters. - if wsite.lang not in project_inserted: - raise NoProjectFound(u'Your project is not supported by the framework. You have to edit the script and add it!') - # The follow lines translate the language's parameters. welcomer = wikipedia.translate(wsite, netext) summ = wikipedia.translate(wsite, summary)

1 0

[Pywikipedia-l] [ pywikipediabot-Patches-1828962 ] interwiki.py - Uzbek language summary translation
by SourceForge.net 09 Nov '07

09 Nov '07

Patches item #1828962, was opened at 2007-11-09 14:30 Message generated for change (Tracker Item Submitted) made by Item Submitter You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=603140&aid=1828962&group_… Please note that this message will contain a full copy of the comment thread, including the initial issue submission, for this request, not just the latest update. Category: None Group: None Status: Open Resolution: None Priority: 5 Private: No Submitted By: Alleborgo (alleborgo) Assigned to: Nobody/Anonymous (nobody) Summary: interwiki.py - Uzbek language summary translation Initial Comment: patch to add edit summary translations for Uzbek language. ---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=603140&aid=1828962&group_…

1 0

[Pywikipedia-l] SVN: [4519] trunk/pywikipedia
by cosoleto＠svn.wikimedia.org 09 Nov '07

09 Nov '07

Revision: 4519 Author: cosoleto Date: 2007-11-09 11:40:26 +0000 (Fri, 09 Nov 2007) Log Message: ----------- fixed line ending style Modified Paths: -------------- trunk/pywikipedia/copyright_clean.py trunk/pywikipedia/copyright_put.py Modified: trunk/pywikipedia/copyright_clean.py =================================================================== --- trunk/pywikipedia/copyright_clean.py 2007-11-09 11:34:57 UTC (rev 4518) +++ trunk/pywikipedia/copyright_clean.py 2007-11-09 11:40:26 UTC (rev 4519) @@ -1,159 +1,159 @@ -# -*- coding: utf-8 -*- -""" -""" - -# -# (C) Francesco Cosoleto, 2006 -# -# Distributed under the terms of the MIT license. -# - -import httplib, socket, simplejson, re, time -import config, wikipedia, catlib, pagegenerators, query - -from urllib import urlencode -from copyright import mysplit, put, reports_cat - -import sys - -summary_msg = { - 'en': u'Removing', - 'it': u'Rimozione', -} - -headC = re.compile("(?m)^=== (?:<strike>)?(?:<s>)?(?:<del>)?\[\[(?::)?(.*?)\]\]") -separatorC = re.compile('(?m)^== +') -next_headC = re.compile("(?m)^=+.*?=+") - -# -# {{botbox|title|newid|oldid|author|...}} -rev_templateC = re.compile("(?m)^(?:{{/t\|.*?}}\n?)?{{(?:/box|botbox)\|.*?\|(.*?)\|") - -def query_yurik_api(data): - - predata = [ - ('format', 'json'), - ('what', 'revisions'), - ('rvlimit', '1'), - data] - - data = urlencode(predata) - host = wikipedia.getSite().hostname() - address = wikipedia.getSite().query_address() - conn = httplib.HTTPConnection(host) - conn.request("GET", address + data) - response = conn.getresponse() - data = response.read() - conn.close() - - return data - -def page_exist(title): - for pageobjs in query_results_titles: - for key in pageobjs['pages']: - if pageobjs['pages'][key]['title'] == title: - if int(key) >= 0: - return True - wikipedia.output('* ' + title) - return False - -def revid_exist(revid): - for pageobjs in query_results_revids: - for id in pageobjs['pages']: - for rv in range(len(pageobjs['pages'][id]['revisions'])): - if pageobjs['pages'][id]['revisions'][rv]['revid'] == int(revid): - # print rv - return True - wikipedia.output('* ' + revid) - return False - -cat = catlib.Category(wikipedia.getSite(), 'Category:%s' % wikipedia.translate(wikipedia.getSite(), reports_cat)) -gen = pagegenerators.CategorizedPageGenerator(cat, recurse = True) - -for page in gen: - data = page.get() - wikipedia.output(page.aslink()) - output = '' - - # - # Preserve text before of the sections - # - - m = re.search("(?m)^==\s*[^=]*?\s*==", data) - if m: - output = data[:m.end() + 1] - else: - m = re.search("(?m)^===\s*[^=]*?", data) - if not m: - continue - output = data[:m.start()] - - titles = headC.findall(data) - revids = rev_templateC.findall(data) - - query_results_titles = list() - query_results_revids = list() - - # No more of 100 titles at a time using Yurik's API - for s in mysplit(query.ListToParam(titles), 100, "|"): - query_results_titles.append(simplejson.loads(query_yurik_api(('titles', s)))) - for s in mysplit(query.ListToParam(revids), 100, "|"): - query_results_revids.append(simplejson.loads(query_yurik_api(('revids', s)))) - - comment_entry = list() - add_separator = False - index = 0 - - while True: - head = headC.search(data, index) - if not head: - break - index = head.end() - title = head.group(1) - next_head = next_headC.search(data, index) - if next_head: - if separatorC.search(data[next_head.start():next_head.end()]): - add_separator = True - stop = next_head.start() - else: - stop = len(data) - - exist = True - if page_exist(title): - # check {{botbox}} - revid = re.search("{{(?:/box|botbox)\|.*?\|(.*?)\|", data[head.end():stop]) - if revid: - if not revid_exist(revid.group(1)): - exist = False - else: - exist = False - - if exist: - output += "=== [[" + title + "]]" + data[head.end():stop] - else: - comment_entry.append("[[%s]]" % title) - - if add_separator: - output += data[next_head.start():next_head.end()] + '\n' - add_separator = False - - add_comment = u'%s: %s' % (wikipedia.translate(wikipedia.getSite(), summary_msg),", ".join(comment_entry)) - - # remove useless newlines - output = re.sub("(?m)^\n", "", output) - - if comment_entry: - wikipedia.output(add_comment) - if wikipedia.verbose: - wikipedia.showDiff(page.get(), output) - - if len(sys.argv)!=1: - choice = wikipedia.inputChoice(u'Do you want to clean the page?', ['Yes', 'No'], ['y', 'n'], 'n') - if choice in ['n', 'N']: - continue - try: - put(page, output, add_comment) - except wikipedia.PageNotSaved: - raise - -wikipedia.stopme() +# -*- coding: utf-8 -*- +""" +""" + +# +# (C) Francesco Cosoleto, 2006 +# +# Distributed under the terms of the MIT license. +# + +import httplib, socket, simplejson, re, time +import config, wikipedia, catlib, pagegenerators, query + +from urllib import urlencode +from copyright import mysplit, put, reports_cat + +import sys + +summary_msg = { + 'en': u'Removing', + 'it': u'Rimozione', +} + +headC = re.compile("(?m)^=== (?:<strike>)?(?:<s>)?(?:<del>)?\[\[(?::)?(.*?)\]\]") +separatorC = re.compile('(?m)^== +') +next_headC = re.compile("(?m)^=+.*?=+") + +# +# {{botbox|title|newid|oldid|author|...}} +rev_templateC = re.compile("(?m)^(?:{{/t\|.*?}}\n?)?{{(?:/box|botbox)\|.*?\|(.*?)\|") + +def query_yurik_api(data): + + predata = [ + ('format', 'json'), + ('what', 'revisions'), + ('rvlimit', '1'), + data] + + data = urlencode(predata) + host = wikipedia.getSite().hostname() + address = wikipedia.getSite().query_address() + conn = httplib.HTTPConnection(host) + conn.request("GET", address + data) + response = conn.getresponse() + data = response.read() + conn.close() + + return data + +def page_exist(title): + for pageobjs in query_results_titles: + for key in pageobjs['pages']: + if pageobjs['pages'][key]['title'] == title: + if int(key) >= 0: + return True + wikipedia.output('* ' + title) + return False + +def revid_exist(revid): + for pageobjs in query_results_revids: + for id in pageobjs['pages']: + for rv in range(len(pageobjs['pages'][id]['revisions'])): + if pageobjs['pages'][id]['revisions'][rv]['revid'] == int(revid): + # print rv + return True + wikipedia.output('* ' + revid) + return False + +cat = catlib.Category(wikipedia.getSite(), 'Category:%s' % wikipedia.translate(wikipedia.getSite(), reports_cat)) +gen = pagegenerators.CategorizedPageGenerator(cat, recurse = True) + +for page in gen: + data = page.get() + wikipedia.output(page.aslink()) + output = '' + + # + # Preserve text before of the sections + # + + m = re.search("(?m)^==\s*[^=]*?\s*==", data) + if m: + output = data[:m.end() + 1] + else: + m = re.search("(?m)^===\s*[^=]*?", data) + if not m: + continue + output = data[:m.start()] + + titles = headC.findall(data) + revids = rev_templateC.findall(data) + + query_results_titles = list() + query_results_revids = list() + + # No more of 100 titles at a time using Yurik's API + for s in mysplit(query.ListToParam(titles), 100, "|"): + query_results_titles.append(simplejson.loads(query_yurik_api(('titles', s)))) + for s in mysplit(query.ListToParam(revids), 100, "|"): + query_results_revids.append(simplejson.loads(query_yurik_api(('revids', s)))) + + comment_entry = list() + add_separator = False + index = 0 + + while True: + head = headC.search(data, index) + if not head: + break + index = head.end() + title = head.group(1) + next_head = next_headC.search(data, index) + if next_head: + if separatorC.search(data[next_head.start():next_head.end()]): + add_separator = True + stop = next_head.start() + else: + stop = len(data) + + exist = True + if page_exist(title): + # check {{botbox}} + revid = re.search("{{(?:/box|botbox)\|.*?\|(.*?)\|", data[head.end():stop]) + if revid: + if not revid_exist(revid.group(1)): + exist = False + else: + exist = False + + if exist: + output += "=== [[" + title + "]]" + data[head.end():stop] + else: + comment_entry.append("[[%s]]" % title) + + if add_separator: + output += data[next_head.start():next_head.end()] + '\n' + add_separator = False + + add_comment = u'%s: %s' % (wikipedia.translate(wikipedia.getSite(), summary_msg),", ".join(comment_entry)) + + # remove useless newlines + output = re.sub("(?m)^\n", "", output) + + if comment_entry: + wikipedia.output(add_comment) + if wikipedia.verbose: + wikipedia.showDiff(page.get(), output) + + if len(sys.argv)!=1: + choice = wikipedia.inputChoice(u'Do you want to clean the page?', ['Yes', 'No'], ['y', 'n'], 'n') + if choice in ['n', 'N']: + continue + try: + put(page, output, add_comment) + except wikipedia.PageNotSaved: + raise + +wikipedia.stopme() Modified: trunk/pywikipedia/copyright_put.py =================================================================== --- trunk/pywikipedia/copyright_put.py 2007-11-09 11:34:57 UTC (rev 4518) +++ trunk/pywikipedia/copyright_put.py 2007-11-09 11:40:26 UTC (rev 4519) @@ -1,252 +1,252 @@ -# -*- coding: utf-8 -*- -""" -""" - -# -# (C) Francesco Cosoleto, 2006 -# -# Distributed under the terms of the MIT license. -# - -import sys, re, codecs, os, time, shutil -import wikipedia, config, date - -from copyright import put, join_family_data, appdir, reports_cat - -# -# Month + Year save method -append_date_to_wiki_save_path = True - -# -# Add pubblication date to entries (template:botdate) -append_date_to_entries = False - -msg_table = { - 'it': {'_default': [u'Pagine nuove', u'Nuove voci'], - 'feed': [u'Aggiunte a voci esistenti', u'Testo aggiunto in']}, - 'en': {'_default': [u'New entries', u'New entries']} -} - -wiki_save_path = { - '_default': u'User:%s/Report' % config.usernames[wikipedia.getSite().family.name][wikipedia.getSite().lang], - 'it': u'Utente:RevertBot/Report' -} - -template_cat = { - '_default': [u'This template is used by copyright.py, a script part of [[:m:Using the python wikipediabot|PyWikipediaBot]].', u''], - 'it': [u'Questo template è usato dallo script copyright.py del [[:m:Using the python wikipediabot|PyWikipediaBot]].', u'Template usati da bot'], -} - -stat_msg = { - 'en': [u'Statistics', u'Page', u'Entries', u'Size', u'Total', 'Update'], - 'it': [u'Statistiche', u'Pagina', u'Segnalazioni', u'Lunghezza', u'Totale', u'Ultimo aggiornamento'], -} - -wiki_save_path = wikipedia.translate(wikipedia.getSite(), wiki_save_path) -template_cat = wikipedia.translate(wikipedia.getSite(), template_cat) -stat_wiki_save_path = '%s/%s' % (wiki_save_path, wikipedia.translate(wikipedia.getSite(), stat_msg)[0]) - -if append_date_to_wiki_save_path: - wiki_save_path += '_' + date.monthName(wikipedia.getSite().language(), time.localtime()[1]) + '_' + str(time.localtime()[0]) - -separatorC = re.compile('(?m)^== +') - -def set_template(name = None): - - site = wikipedia.getSite() - url = "%s://%s%s" % (site.protocol(), site.hostname(), site.path()) - - botdate = u""" -<div style="text-align:right">{{{1}}}</div><noinclude>%s\n[[%s:%s]]</noinclude> -""" % (template_cat[0], site.namespace(14), template_cat[1]) - - botbox = """ -<div class=plainlinks style="text-align:right">[%s?title={{{1}}}&diff={{{2}}}&oldid={{{3}}} diff] - [%s?title={{{1}}}&action=history cron] - [%s?title=Special:Log&page={{{1}}} log]</div><noinclude>%s\n[[%s:%s]]</noinclude> -""" % (url, url, url, template_cat[0], site.namespace(14), template_cat[1]) - - if name == 'botdate': - p = wikipedia.Page(site, 'Template:botdate') - if not p.exists(): - p.put(botdate, comment = 'Init.') - - if name == 'botbox': - p = wikipedia.Page(site, 'Template:botbox') - if not p.exists(): - p.put(botbox, comment = 'Init.') - -def stat_sum(engine, text): - return len(re.findall('(?im)^\*.*?' + engine + '.*?- ', text)) - -def get_stats(): - - import catlib, pagegenerators - - msg = wikipedia.translate(wikipedia.getSite(), stat_msg) - - cat = catlib.Category(wikipedia.getSite(), 'Category:%s' % wikipedia.translate(wikipedia.getSite(), reports_cat)) - gen = pagegenerators.CategorizedPageGenerator(cat, recurse = True) - - output = u"""{| {{prettytable|width=|align=|text-align=left}} -! %s -! %s -! %s -! %s -! %s -! %s -|- -""" % ( msg[1], msg[2], msg[3], 'Google', 'Yahoo', 'Live Search' ) - - gnt = 0 ; ynt = 0 ; mnt = 0 ; ent = 0 ; sn = 0 ; snt = 0 - - for page in gen: - data = page.get() - - gn = stat_sum('google', data) - yn = stat_sum('yahoo', data) - mn = stat_sum('(msn|live)', data) - - en = len(re.findall('=== \[\[', data)) - sn = len(data) - - gnt += gn ; ynt += yn ; mnt += mn ; ent += en ; snt += sn - - output += u"|%s||%s||%s KB||%s||%s||%s\n|-\n" % (page.aslink(), en, sn / 1024, gn, yn, mn) - - output += u"""| |||||||| -|- -|'''%s'''||%s||%s KB||%s||%s||%s -|- -|colspan="6" align=right style="background-color:#eeeeee;"|<small>''%s: %s''</small> -|} -""" % (msg[4], ent, snt / 1024, gnt, ynt, mnt, msg[5], time.strftime("%d " + "%s" % (date.monthName(wikipedia.getSite().language(), time.localtime()[1])) + " %Y")) - - return output - -def put_stats(): - page = wikipedia.Page(wikipedia.getSite(), stat_wiki_save_path) - page.put(get_stats(), comment = wikipedia.translate(wikipedia.getSite(), stat_msg)[0]) - -def output_files_gen(): - for f in os.listdir(appdir): - if 'output' in f and not '_pending' in f: - m = re.search('output_(.*?)\.txt', f) - if m: - tag = m.group(1) - else: - tag = '_default' - - section_name_and_summary = wikipedia.translate(wikipedia.getSite(), msg_table)[tag] - - section = section_name_and_summary[0] - summary = section_name_and_summary[1] - - yield os.path.join(appdir, f), section, summary - -def read_output_file(filename): - if os.path.isfile(filename + '_pending'): - shutil.move(filename, filename + '_temp') - ap = codecs.open(filename + '_pending', 'a', 'utf-8') - ot = codecs.open(filename + '_temp', 'r', 'utf-8') - ap.write(ot.read()) - ap.close() - ot.close() - os.remove(filename + '_temp') - else: - shutil.move(filename, filename + '_pending') - - f = codecs.open(filename + '_pending', 'r', 'utf-8') - data = f.read() - f.close() - - return data - -def run(send_stats = False): - page = wikipedia.Page(wikipedia.getSite(), wiki_save_path) - - try: - wikitext = page.get() - except wikipedia.NoPage: - wikipedia.output("%s not found." % page.aslink()) - wikitext = '[[%s:%s]]\n' % (wikipedia.getSite().namespace(14), wikipedia.translate(wikipedia.getSite(), reports_cat)) - - final_summary = u'' - output_files = list() - - for f, section, summary in output_files_gen(): - wikipedia.output('File: \'%s\'\nSection: %s\n' % (f, section)) - - output_data = read_output_file(f) - output_files.append(f) - - entries = re.findall('=== (.*?) ===', output_data) - - if not entries: - continue - - if append_date_to_entries: - dt = time.strftime('%d-%m-%Y %H:%M', time.localtime()) - output_data = re.sub("(?m)^(=== \[\[.*?\]\] ===\n)", r"\1{{botdate|%s}}\n" % dt, output_data) - - m = re.search('(?m)^==\s*%s\s*==' % section, wikitext) - if m: - m_end = re.search(separatorC, wikitext[m.end():]) - if m_end: - wikitext = wikitext[:m_end.start() + m.end()] + output_data + wikitext[m_end.start() + m.end():] - else: - wikitext += '\n' + output_data - else: - wikitext += '\n' + output_data - - if final_summary: - final_summary += ' ' - final_summary += u'%s: %s' % (summary, ', '.join(entries)) - - if final_summary: - wikipedia.output(final_summary + '\n') - - # if a page in 'Image' or 'Category' namespace is checked then fix - # title section by adding ':' in order to avoid wiki code effects. - - wikitext = re.sub(u'(?i)=== \[\[%s:' % join_family_data('Image', 6), ur'== [[:\1:', wikitext) - wikitext = re.sub(u'(?i)=== \[\[%s:' % join_family_data('Category', 14), ur'== [[:\1:', wikitext) - - # TODO: - # List of frequent rejected address to improve upload process. - - wikitext = re.sub('http://(.*?)((forumcommunity|forumfree).net)',r'<blacklist>\1\2', wikitext) - - if len(final_summary)>=200: - final_summary = final_summary[:200] - final_summary = final_summary[:final_summary.rindex("[")-3] + "..." - - try: - put(page, wikitext, comment = final_summary) - for f in output_files: - os.remove(f + '_pending') - wikipedia.output("\'%s\' deleted." % f) - except wikipedia.PageNotSaved: - raise - - if append_date_to_entries: - set_template(name = 'botdate') - if '{{botbox' in wikitext: - set_template(name = 'botbox') - - if send_stats: - put_stats() - -def main(): - # - # Send statistics - send_stats = False - - for arg in wikipedia.handleArgs(): - if arg == "-stats": - send_stats = True - run(send_stats = send_stats) - -if __name__ == "__main__": - try: - main() - finally: +# -*- coding: utf-8 -*- +""" +""" + +# +# (C) Francesco Cosoleto, 2006 +# +# Distributed under the terms of the MIT license. +# + +import sys, re, codecs, os, time, shutil +import wikipedia, config, date + +from copyright import put, join_family_data, appdir, reports_cat + +# +# Month + Year save method +append_date_to_wiki_save_path = True + +# +# Add pubblication date to entries (template:botdate) +append_date_to_entries = False + +msg_table = { + 'it': {'_default': [u'Pagine nuove', u'Nuove voci'], + 'feed': [u'Aggiunte a voci esistenti', u'Testo aggiunto in']}, + 'en': {'_default': [u'New entries', u'New entries']} +} + +wiki_save_path = { + '_default': u'User:%s/Report' % config.usernames[wikipedia.getSite().family.name][wikipedia.getSite().lang], + 'it': u'Utente:RevertBot/Report' +} + +template_cat = { + '_default': [u'This template is used by copyright.py, a script part of [[:m:Using the python wikipediabot|PyWikipediaBot]].', u''], + 'it': [u'Questo template è usato dallo script copyright.py del [[:m:Using the python wikipediabot|PyWikipediaBot]].', u'Template usati da bot'], +} + +stat_msg = { + 'en': [u'Statistics', u'Page', u'Entries', u'Size', u'Total', 'Update'], + 'it': [u'Statistiche', u'Pagina', u'Segnalazioni', u'Lunghezza', u'Totale', u'Ultimo aggiornamento'], +} + +wiki_save_path = wikipedia.translate(wikipedia.getSite(), wiki_save_path) +template_cat = wikipedia.translate(wikipedia.getSite(), template_cat) +stat_wiki_save_path = '%s/%s' % (wiki_save_path, wikipedia.translate(wikipedia.getSite(), stat_msg)[0]) + +if append_date_to_wiki_save_path: + wiki_save_path += '_' + date.monthName(wikipedia.getSite().language(), time.localtime()[1]) + '_' + str(time.localtime()[0]) + +separatorC = re.compile('(?m)^== +') + +def set_template(name = None): + + site = wikipedia.getSite() + url = "%s://%s%s" % (site.protocol(), site.hostname(), site.path()) + + botdate = u""" +<div style="text-align:right">{{{1}}}</div><noinclude>%s\n[[%s:%s]]</noinclude> +""" % (template_cat[0], site.namespace(14), template_cat[1]) + + botbox = """ +<div class=plainlinks style="text-align:right">[%s?title={{{1}}}&diff={{{2}}}&oldid={{{3}}} diff] - [%s?title={{{1}}}&action=history cron] - [%s?title=Special:Log&page={{{1}}} log]</div><noinclude>%s\n[[%s:%s]]</noinclude> +""" % (url, url, url, template_cat[0], site.namespace(14), template_cat[1]) + + if name == 'botdate': + p = wikipedia.Page(site, 'Template:botdate') + if not p.exists(): + p.put(botdate, comment = 'Init.') + + if name == 'botbox': + p = wikipedia.Page(site, 'Template:botbox') + if not p.exists(): + p.put(botbox, comment = 'Init.') + +def stat_sum(engine, text): + return len(re.findall('(?im)^\*.*?' + engine + '.*?- ', text)) + +def get_stats(): + + import catlib, pagegenerators + + msg = wikipedia.translate(wikipedia.getSite(), stat_msg) + + cat = catlib.Category(wikipedia.getSite(), 'Category:%s' % wikipedia.translate(wikipedia.getSite(), reports_cat)) + gen = pagegenerators.CategorizedPageGenerator(cat, recurse = True) + + output = u"""{| {{prettytable|width=|align=|text-align=left}} +! %s +! %s +! %s +! %s +! %s +! %s +|- +""" % ( msg[1], msg[2], msg[3], 'Google', 'Yahoo', 'Live Search' ) + + gnt = 0 ; ynt = 0 ; mnt = 0 ; ent = 0 ; sn = 0 ; snt = 0 + + for page in gen: + data = page.get() + + gn = stat_sum('google', data) + yn = stat_sum('yahoo', data) + mn = stat_sum('(msn|live)', data) + + en = len(re.findall('=== \[\[', data)) + sn = len(data) + + gnt += gn ; ynt += yn ; mnt += mn ; ent += en ; snt += sn + + output += u"|%s||%s||%s KB||%s||%s||%s\n|-\n" % (page.aslink(), en, sn / 1024, gn, yn, mn) + + output += u"""| |||||||| +|- +|'''%s'''||%s||%s KB||%s||%s||%s +|- +|colspan="6" align=right style="background-color:#eeeeee;"|<small>''%s: %s''</small> +|} +""" % (msg[4], ent, snt / 1024, gnt, ynt, mnt, msg[5], time.strftime("%d " + "%s" % (date.monthName(wikipedia.getSite().language(), time.localtime()[1])) + " %Y")) + + return output + +def put_stats(): + page = wikipedia.Page(wikipedia.getSite(), stat_wiki_save_path) + page.put(get_stats(), comment = wikipedia.translate(wikipedia.getSite(), stat_msg)[0]) + +def output_files_gen(): + for f in os.listdir(appdir): + if 'output' in f and not '_pending' in f: + m = re.search('output_(.*?)\.txt', f) + if m: + tag = m.group(1) + else: + tag = '_default' + + section_name_and_summary = wikipedia.translate(wikipedia.getSite(), msg_table)[tag] + + section = section_name_and_summary[0] + summary = section_name_and_summary[1] + + yield os.path.join(appdir, f), section, summary + +def read_output_file(filename): + if os.path.isfile(filename + '_pending'): + shutil.move(filename, filename + '_temp') + ap = codecs.open(filename + '_pending', 'a', 'utf-8') + ot = codecs.open(filename + '_temp', 'r', 'utf-8') + ap.write(ot.read()) + ap.close() + ot.close() + os.remove(filename + '_temp') + else: + shutil.move(filename, filename + '_pending') + + f = codecs.open(filename + '_pending', 'r', 'utf-8') + data = f.read() + f.close() + + return data + +def run(send_stats = False): + page = wikipedia.Page(wikipedia.getSite(), wiki_save_path) + + try: + wikitext = page.get() + except wikipedia.NoPage: + wikipedia.output("%s not found." % page.aslink()) + wikitext = '[[%s:%s]]\n' % (wikipedia.getSite().namespace(14), wikipedia.translate(wikipedia.getSite(), reports_cat)) + + final_summary = u'' + output_files = list() + + for f, section, summary in output_files_gen(): + wikipedia.output('File: \'%s\'\nSection: %s\n' % (f, section)) + + output_data = read_output_file(f) + output_files.append(f) + + entries = re.findall('=== (.*?) ===', output_data) + + if not entries: + continue + + if append_date_to_entries: + dt = time.strftime('%d-%m-%Y %H:%M', time.localtime()) + output_data = re.sub("(?m)^(=== \[\[.*?\]\] ===\n)", r"\1{{botdate|%s}}\n" % dt, output_data) + + m = re.search('(?m)^==\s*%s\s*==' % section, wikitext) + if m: + m_end = re.search(separatorC, wikitext[m.end():]) + if m_end: + wikitext = wikitext[:m_end.start() + m.end()] + output_data + wikitext[m_end.start() + m.end():] + else: + wikitext += '\n' + output_data + else: + wikitext += '\n' + output_data + + if final_summary: + final_summary += ' ' + final_summary += u'%s: %s' % (summary, ', '.join(entries)) + + if final_summary: + wikipedia.output(final_summary + '\n') + + # if a page in 'Image' or 'Category' namespace is checked then fix + # title section by adding ':' in order to avoid wiki code effects. + + wikitext = re.sub(u'(?i)=== \[\[%s:' % join_family_data('Image', 6), ur'== [[:\1:', wikitext) + wikitext = re.sub(u'(?i)=== \[\[%s:' % join_family_data('Category', 14), ur'== [[:\1:', wikitext) + + # TODO: + # List of frequent rejected address to improve upload process. + + wikitext = re.sub('http://(.*?)((forumcommunity|forumfree).net)',r'<blacklist>\1\2', wikitext) + + if len(final_summary)>=200: + final_summary = final_summary[:200] + final_summary = final_summary[:final_summary.rindex("[")-3] + "..." + + try: + put(page, wikitext, comment = final_summary) + for f in output_files: + os.remove(f + '_pending') + wikipedia.output("\'%s\' deleted." % f) + except wikipedia.PageNotSaved: + raise + + if append_date_to_entries: + set_template(name = 'botdate') + if '{{botbox' in wikitext: + set_template(name = 'botbox') + + if send_stats: + put_stats() + +def main(): + # + # Send statistics + send_stats = False + + for arg in wikipedia.handleArgs(): + if arg == "-stats": + send_stats = True + run(send_stats = send_stats) + +if __name__ == "__main__": + try: + main() + finally: wikipedia.stopme() \ No newline at end of file

1 0

[Pywikipedia-l] AllpagesPageGenerator doesn't start in correct location for NS !=0
by Mak Thorpe 09 Nov '07

09 Nov '07

Bug: In AllpagesPageGenerator, the code determines namespace from the start parameter, but then proceeds in alphabetical order from the start value's namespace name. EG: -start:Template:100 begins processing at Template:Template documentation, not Template:! Proposed Fix: Current: if namespace==None: namespace = wikipedia.Page(wikipedia.getSite(), start).namespace() Add: m = re.search('.+:(.)*',start, flags = 0) if m: start = m.group(1) -Mak

2 1

[Pywikipedia-l] SVN: [4518] trunk/pywikipedia/pagegenerators.py
by huji＠svn.wikimedia.org 09 Nov '07

09 Nov '07

Revision: 4518 Author: huji Date: 2007-11-09 11:34:57 +0000 (Fri, 09 Nov 2007) Log Message: ----------- Fixing a bug reported on mailing list: AllPageGenerator starts from the first letter of the namespace and not the title, when the parameter is formatted as -start:Namespace:Title Modified Paths: -------------- trunk/pywikipedia/pagegenerators.py Modified: trunk/pywikipedia/pagegenerators.py =================================================================== --- trunk/pywikipedia/pagegenerators.py 2007-11-09 10:32:47 UTC (rev 4517) +++ trunk/pywikipedia/pagegenerators.py 2007-11-09 11:34:57 UTC (rev 4518) @@ -114,7 +114,8 @@ """ if namespace==None: namespace = wikipedia.Page(wikipedia.getSite(), start).namespace() - for page in wikipedia.getSite().allpages(start=start, namespace=namespace, includeredirects = includeredirects): + title = wikipedia.Page(wikipedia.getSite(), start).titleWithoutNamespace() + for page in wikipedia.getSite().allpages(start=title, namespace=namespace, includeredirects = includeredirects): yield page def PrefixingPageGenerator(prefix, namespace=None):

1 0

[Pywikipedia-l] SVN: [4517] trunk/pywikipedia/imagecopy.py
by siebrand＠svn.wikimedia.org 09 Nov '07

09 Nov '07

Revision: 4517 Author: siebrand Date: 2007-11-09 10:32:47 +0000 (Fri, 09 Nov 2007) Log Message: ----------- Feature request: make it use pagegenerators.py Modified Paths: -------------- trunk/pywikipedia/imagecopy.py Modified: trunk/pywikipedia/imagecopy.py =================================================================== --- trunk/pywikipedia/imagecopy.py 2007-11-08 13:14:24 UTC (rev 4516) +++ trunk/pywikipedia/imagecopy.py 2007-11-09 10:32:47 UTC (rev 4517) @@ -24,6 +24,7 @@ -start Start at index within category (optional) Known issues/FIXMEs (no critical issues known): +* make it use pagegenerators.py * Some variable names are in Spanish, which makes the code harder to read. * Depending on sorting within a file category, the "next batch" is sometimes not working, leading to an endless loop

1 0

[Pywikipedia-l] Fwd: Rewrite thoughts (includes replies to the4507-thread)
by Huji 09 Nov '07

09 Nov '07

In that sense, I think we should change what valhallasw originally suggested to something like "all files must be in UTF-8 format, without BOM. They must start with "# -*- coding: utf-8 -*-" (without quotes) as their first line..." Hojjat (aka Huji) On 11/7/07, Russell Blau <russblau(a)imapmail.org> wrote: > > Huji wrote: > > > About BOM, I hope every editor has a way to add it to the beginning of > the > > file. (In MediaWiki > > codes, when a BOM was added by notepad, I had to remove it to make the > > code work > > correct, and it was a pain in ass; now, I have this pessimistic feeling > > about adding it for > > Pywikipedia). > > UTF-8 files should not contain a BOM. According to the Unicode BOM > FAQ[1], > "UTF-8 can contain a BOM. However, it makes no difference as to the > endianness of the byte stream. UTF-8 always has the same byte order. An > initial BOM is only used as a signature — an indication that an otherwise > unmarked text file is in UTF-8. Note that some recipients of UTF-8 encoded > > data do not expect a BOM. Where UTF-8 is used transparently in 8-bit > environments, the use of a BOM will interfere with any protocol or file > format that expects specific ASCII characters at the beginning, such as > the > use of "#!" of at the beginning of Unix shell scripts." In Python, the > encoding is specified by an explicit "# -*- coding" line; not only is > there > no need for a BOM, but having one there screws up Python's interpretation > of > the file. > > [1] http://unicode.org/faq/utf_bom.html#25 > > Russ > >

1 0

← Newer
1
...
21
22
23
24
25
26
27
28
29
Older →

Jump to page:

2024

2023

2022

2021

2020

2019

2018

2017

2016

2015

2014

2013

2012

2011

2010

2009

2008

2007

pywikibot November 2007