Patches item #1840253, was opened at 2007-11-28 14:13
Message generated for change (Tracker Item Submitted) made by Item Submitter
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=603140&aid=1840253&group_…
Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.
Category: None
Group: None
Status: Open
Resolution: None
Priority: 5
Private: No
Submitted By: André Malafaya Baptista (malafaya)
Assigned to: Nobody/Anonymous (nobody)
Summary: #REDIRECT alias for Javanese (jv)
Initial Comment:
Please apply attached patch which includes an alias for the #Redirect magic word to language jv - Javanese. Thanks.
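Magic-word aliases like this let a wiki in a given language recognize a localized form of #REDIRECT in addition to the English one. The attached patch (the actual jv alias text) is not reproduced in this message, so the alias below is purely hypothetical; this is only a sketch of how an alias table is consulted, not the MediaWiki implementation:

```python
import re

# Illustrative alias table. '#TURUT' is a HYPOTHETICAL placeholder for the
# real Javanese alias, which lives in the attached (unreproduced) patch.
REDIRECT_ALIASES = {
    'en': ['#REDIRECT'],
    'jv': ['#REDIRECT', '#TURUT'],
}

def is_redirect(lang, wikitext):
    """Check whether wikitext starts with a redirect magic word for lang.

    Falls back to the English '#REDIRECT' for languages without aliases;
    matching is case-insensitive, as in MediaWiki.
    """
    aliases = REDIRECT_ALIASES.get(lang, ['#REDIRECT'])
    pattern = r'^(?:%s)\s*\[\[' % '|'.join(re.escape(a) for a in aliases)
    return re.match(pattern, wikitext, re.I) is not None
```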
----------------------------------------------------------------------
Patches item #1840148, was opened at 2007-11-28 11:50
Message generated for change (Comment added) made by rotemliss
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=603140&aid=1840148&group_…
Category: None
Group: None
Status: Open
Resolution: None
Priority: 5
Private: No
Submitted By: Alex S.H. Lin (lin4h)
Assigned to: Nobody/Anonymous (nobody)
Summary: update clean_sandbox.py Info for jawiki
Initial Comment:
as title.
----------------------------------------------------------------------
Comment By: Rotem Liss (rotemliss)
Date: 2007-11-28 12:00
Message:
Logged In: YES
user_id=1327030
Originator: NO
Added in r4607.
----------------------------------------------------------------------
Bugs item #1829405, was opened at 2007-11-09 23:47
Message generated for change (Comment added) made by sf-robot
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=603138&aid=1829405&group_…
Category: None
Group: None
>Status: Closed
Resolution: Invalid
Priority: 5
Private: No
Submitted By: Nobody/Anonymous (nobody)
Assigned to: Leonardo Gregianin (leogregianin)
Summary: redirect.py says pages don't exist
Initial Comment:
When running "python redirect.py double", 120 double redirects are found and each is opened individually, but instead of fixing them the script reports "[[PAGENAME]] doesn't exist."
See the attached screenshot.
anotherpeteparker(a)gmail.com
----------------------------------------------------------------------
>Comment By: SourceForge Robot (sf-robot)
Date: 2007-11-27 19:20
Message:
Logged In: YES
user_id=1312539
Originator: NO
This Tracker item was closed automatically by the system. It was
previously set to a Pending status, and the original submitter
did not respond within 14 days (the time period specified by
the administrator of this Tracker).
----------------------------------------------------------------------
Comment By: Leonardo Gregianin (leogregianin)
Date: 2007-11-13 08:47
Message:
Logged In: YES
user_id=1136737
Originator: NO
The list of double redirects on Wikipedia is cached; the article in question was already deleted, but the Special:DoubleRedirects list is not yet up to date.
----------------------------------------------------------------------
Revision: 4605
Author: filnik
Date: 2007-11-27 16:23:45 +0000 (Tue, 27 Nov 2007)
Log Message:
-----------
Bugfix and adding hu language
Modified Paths:
--------------
trunk/pywikipedia/checkimages.py
Modified: trunk/pywikipedia/checkimages.py
===================================================================
--- trunk/pywikipedia/checkimages.py 2007-11-27 16:20:41 UTC (rev 4604)
+++ trunk/pywikipedia/checkimages.py 2007-11-27 16:23:45 UTC (rev 4605)
@@ -82,12 +82,14 @@
'commons':'\n{{subst:nld}}',
'en' :'\n{{subst:nld}}',
'it' :'\n{{subst:unverdata}}',
+ 'hu' :u'\n{{nincslicenc|~~~~~}}',
}
txt_find = {
'commons':['{{no license', '{{nld'],
'en':['{{nld', '{{no license'],
'it':['{{unverdata', '{{unverified'],
+ 'hu':[u'{{nincsforrás',u'{{nincslicenc'],
}
# Summary for when the will add the no source
@@ -95,6 +97,7 @@
'commons':'Bot: Marking newly uploaded untagged file',
'en' :'Bot: Marking newly uploaded untagged file',
'it' :"Bot: Aggiungo unverified",
+ 'hu' :'Robot: Frissen feltöltött licencsablon nélküli fájl megjelölése',
}
# Summary that the bot use when it notify the problem with the image's license
@@ -102,6 +105,7 @@
'commons':"Bot: Requesting source information." ,
'en' :"Bot: Requesting source information." ,
'it' :"Bot: Notifico l'unverified",
+ 'hu' :'Robot: Forrásinformáció kérése',
}
# When the Bot find that the usertalk is empty is not pretty to put only the no source without the welcome, isn't it?
@@ -109,6 +113,7 @@
'commons':'{{subst:welcome}}\n~~~~\n',
'en' :'{{welcome}}\n~~~~\n',
'it' :'{{benvenuto}}\n~~~~\n',
+ 'hu' :u'{{subst:Üdvözlet|~~~~}}\n',
}
# General summary
@@ -116,6 +121,7 @@
'commons':'Bot: no source',
'en' :'Bot: no source',
'it' :'Bot: Unverified!',
+ 'hu' :'Robot: nincs forrás',
}
# if the file has an unknown extension it will be tagged with this template.
@@ -124,6 +130,7 @@
'commons':"{{db-meta|The file has .%s as extension.}}",
'en' :"{{db-meta|The file has .%s as extension.}}",
'it' :'{{cancella subito|motivo=Il file ha come estensione ".%s"}}',
+ 'hu' :u'{{azonnali|A fájlnak .%s a kiterjesztése}}',
}
# The header of the Unknown extension's message.
@@ -131,6 +138,7 @@
'commons':"\n== Unknown extension! ==\n",
'en' :"\n== Unknown extension! ==\n",
'it' :'\n== File non specificato ==\n',
+ 'hu' :u'\n== Ismeretlen kiterjesztésű fájl ==\n',
}
# Text that will be add if the bot find a unknown extension.
@@ -138,12 +146,14 @@
'commons':'The [[:Image:%s]] file has a wrong extension, please check. ~~~~',
'en' :'The [[:Image:%s]] file has a wrong extension, please check. ~~~~',
'it' :'{{subst:Utente:Filbot/Ext|%s}}',
+ 'hu' :u'A [[:Kép:%s]] fájlnak rossz a kiterjesztése, kérlek ellenőrízd. ~~~~',
}
# Summary of the delate immediately. (f.e: Adding {{db-meta|The file has .%s as extension.}})
del_comm = {
'commons':'Bot: Adding %s',
'en' :'Bot: Adding %s',
'it' :'Bot: Aggiungo %s',
+ 'hu' :u'Robot:"%s" hozzáadása',
}
# This is the most important header, because it will be used a lot. That's the header that the bot
@@ -152,12 +162,14 @@
'commons':"",# Nothing, the template has already the header inside.
'en' :"\n== Image without license ==\n",
'it' :"\n== Immagine senza licenza ==\n",
+ 'hu' :u"\n== Licenc nélküli kép ==\n",
}
# That's the text that the bot will add if it doesn't find the license.
nothing_notification = {
'commons':"{{subst:User:Filnik/untagged|Image:%s}}Image:%s}}\n\n''This message was '''added automatically by [[User:Filbot|Filbot]]''', if you need some help about it, ask [[User:Filnik|its master]] or go to the [[Commons:Help desk]]''. --~~~~",
'en' :"{{subst:image source|Image:%s}} --~~~~",
'it' :"{{subst:Utente:Filbot/Senza licenza|%s}} --~~~~",
+ 'hu' :u"{{subst:adjforrást|Kép:%s}} \n Ezt az üzenetet ~~~ automatikusan helyezte el a vitalapodon, kérdéseddel fordulj a gazdájához, vagy a [[WP:KF|Kocsmafalhoz]]. --~~~~",
}
# This is a list of what bots used this script in your project.
# NOTE: YOUR Botnick is automatically added. It's not required to add it twice.
@@ -172,12 +184,14 @@
'commons':None,
'en': None,
'it':'{{subst:Utente:Filbot/Senza licenza2|%s}} --~~~~',
+ 'hu':u'\nSzia! Úgy tűnik a [[:Kép:%s]] képpel is hasonló a probléma, mint az előbbivel. Kérlek olvasd el a [[WP:KÉPLIC|feltölthető képek]]ről szóló oldalunk, és segítségért fordulj a [[WP:KF-JO|Jogi kocsmafalhoz]]. Köszönöm --~~~~',
}
# You can add some settings to wikipedia. In this way, you can change them without touch the code.
# That's useful if you are running the bot on Toolserver.
page_with_settings = {
'commons':None,
'en':None,
+ 'hu':None,
'it':'Utente:Nikbot/Settings#Settings',
}
# The bot can report some images (like the images that have the same name of an image on commons)
@@ -186,6 +200,7 @@
'commons':'User:Filbot/Report',
'en' :'User:Filnik/Report',
'it' :'Utente:Nikbot/Report',
+ 'hu' :'User:Bdamokos/Report',
}
# Adding the date after the signature.
timeselected = u' ~~~~~'
@@ -194,12 +209,14 @@
'commons':"\n*[[:Image:%s]] " + timeselected,
'en':"\n*[[:Image:%s]] " + timeselected,
'it':"\n*[[:Immagine:%s]] " + timeselected,
+ 'hu':u"\n*[[:Kép:%s]] " + timeselected,
}
# The summary of the report
comm10 = {
'commons':'Bot: Updating the log',
'en':'Bot: Updating the log',
'it':'Bot: Aggiorno il log',
+ 'hu': 'Robot: A napló frissítése',
}
# If a template isn't a license but it's included on a lot of images, that can be skipped to
@@ -208,10 +225,11 @@
'commons':['{{information'],
'en':['{{information'],
'it':['{{edp', '{{informazioni file', '{{information'],
+ 'hu':[u'{{információ','{{enwiki', '{{azonnali'],
}
# Add your project (in alphabetical order) if you want that the bot start
-project_inserted = ['commons', 'en', 'it']
+project_inserted = ['commons', 'en','hu', 'it']
# Ok, that's all. What is below, is the rest of code, now the code is fixed and it will run correctly in your project.
#########################################################################################################################
@@ -464,37 +482,40 @@
def takesettings(self, settings):
pos = 0
- x = wikipedia.Page(self.site, settings)
- lista = list()
- try:
- testo = x.get()
- rxp = "<------- ------->\n\*[Nn]ame=['\"](.*?)['\"]\n\*([Ff]ind|[Ff]indonly)=(.*?)\n\*[Ii]magechanges=(.*?)\n\*[Ss]ummary=['\"](.*?)['\"]\n\*[Hh]ead=['\"](.*?)['\"]\n\*[Tt]ext ?= ?['\"](.*?)['\"]\n\*[Mm]ex ?= ?['\"]?(.*?)['\"]?$"
- r = re.compile(rxp, re.UNICODE|re.M)
- number = 1
- while 1:
- m = r.search(testo, pos)
- if m == None:
- if lista == list():
- wikipedia.output(u"You've set wrongly your settings, please take a look to the relative page. (run without them)")
- lista = None
- else:
- break
- else:
- pos = m.end()
- name = str(m.group(1))
- find_tipe = str(m.group(2))
- find = str(m.group(3))
- imagechanges = str(m.group(4))
- summary = str(m.group(5))
- head = str(m.group(6))
- text = str(m.group(7))
- mexcatched = str(m.group(8))
- tupla = [number, name, find_tipe, find, imagechanges, summary, head, text, mexcatched]
- lista += [tupla]
- number += 1
- except wikipedia.NoPage:
- lista = None
- return lista
+ if settings != None:
+ x = wikipedia.Page(self.site, settings)
+ lista = list()
+ try:
+ testo = x.get()
+ rxp = "<------- ------->\n\*[Nn]ame=['\"](.*?)['\"]\n\*([Ff]ind|[Ff]indonly)=(.*?)\n\*[Ii]magechanges=(.*?)\n\*[Ss]ummary=['\"](.*?)['\"]\n\*[Hh]ead=['\"](.*?)['\"]\n\*[Tt]ext ?= ?['\"](.*?)['\"]\n\*[Mm]ex ?= ?['\"]?(.*?)['\"]?$"
+ r = re.compile(rxp, re.UNICODE|re.M)
+ number = 1
+ while 1:
+ m = r.search(testo, pos)
+ if m == None:
+ if lista == list():
+ wikipedia.output(u"You've set wrongly your settings, please take a look to the relative page. (run without them)")
+ lista = None
+ else:
+ break
+ else:
+ pos = m.end()
+ name = str(m.group(1))
+ find_tipe = str(m.group(2))
+ find = str(m.group(3))
+ imagechanges = str(m.group(4))
+ summary = str(m.group(5))
+ head = str(m.group(6))
+ text = str(m.group(7))
+ mexcatched = str(m.group(8))
+ tupla = [number, name, find_tipe, find, imagechanges, summary, head, text, mexcatched]
+ lista += [tupla]
+ number += 1
+ except wikipedia.NoPage:
+ lista = None
+ return lista
+ else:
+ return []
def load(self, raw):
list_loaded = list()
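The takesettings change in r4605 guards against projects that have no settings page (the new page_with_settings['hu'] entry is None), returning an empty list instead of trying to fetch a None page. A minimal standalone sketch of that guard, using a simplified two-field settings format (the real regex parses many more fields):

```python
import re

# Simplified settings format, modeled loosely on the block that
# checkimages.py parses from the on-wiki settings page.
SETTINGS_RX = re.compile(
    r"\*[Nn]ame=['\"](.*?)['\"]\n\*[Ss]ummary=['\"](.*?)['\"]", re.M)

def take_settings(raw_text):
    """Parse numbered settings entries from wikitext.

    Returns [] when raw_text is None, mirroring the r4605 guard for
    projects whose page_with_settings entry is None.
    """
    if raw_text is None:
        return []
    entries = []
    for number, match in enumerate(SETTINGS_RX.finditer(raw_text), start=1):
        entries.append([number, match.group(1), match.group(2)])
    return entries
```

With this guard, enabling the bot on a new project only requires an explicit None entry in page_with_settings rather than creating a settings page.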
Revision: 4604
Author: siebrand
Date: 2007-11-27 16:20:41 +0000 (Tue, 27 Nov 2007)
Log Message:
-----------
Rename lower case
Added Paths:
-----------
trunk/pywikipedia/add_text.py
Removed Paths:
-------------
trunk/pywikipedia/AddText.py
Deleted: trunk/pywikipedia/AddText.py
===================================================================
--- trunk/pywikipedia/AddText.py 2007-11-27 16:11:27 UTC (rev 4603)
+++ trunk/pywikipedia/AddText.py 2007-11-27 16:20:41 UTC (rev 4604)
@@ -1,284 +0,0 @@
-#!/usr/bin/python
-# -*- coding: utf-8 -*-
-"""
-This is a Bot written by Filnik to add a text in a given category.
-
---- GenFactory Generator is used ---
--start Define from which page should the Bot start
--ref Use the ref as generator
--cat Use a category as generator
--filelinks Use all the links to an image as generator
--unusedfiles
--unwatched
--withoutinterwiki
--interwiki
--file
--uncatfiles
--uncatcat
--uncat
--subcat
--transcludes Use all the page that transclude a certain page as generator
--weblink Use the pages with a certain web link as generator
--links Use the links from a certain page as generator
--regex Only work on pages whose titles match the given regex
-
---- Other parameters ---
--page Use a page as generator
--text Define which text add
--summary Define the summary to use
--except Use a regex to understand if the template is already in the page
--excepturl Use the html page as text where you want to see if there's the text, not the wiki-page.
--newimages Add text in the new images
--untagged Add text in the images that doesn't have any license template
--always If used, the bot won't asked if it should add the text specified
-"""
-
-#
-# (C) Filnik, 2007
-#
-# Distributed under the terms of the MIT license.
-#
-__version__ = '$Id: AddText.py,v 1.0 2007/11/27 17:08:30 filnik Exp$'
-#
-
-import re, pagegenerators, urllib2, urllib
-import wikipedia, catlib
-
-class NoEnoughData(wikipedia.Error):
- """ Error class for when the user doesn't specified all the data needed """
-
-class NothingFound(wikipedia.Error):
- """ An exception indicating that a regex has return [] instead of results."""
-
-def pageText(url):
- try:
- request = urllib2.Request(url)
- user_agent = 'Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.7.12) Gecko/20050915 Firefox/1.0.7'
- request.add_header("User-Agent", user_agent)
- response = urllib2.urlopen(request)
- text = response.read()
- response.close()
- # When you load to many users, urllib2 can give this error.
- except urllib2.HTTPError:
- wikipedia.output(u"Server error. Pausing for 10 seconds... " + time.strftime("%d %b %Y %H:%M:%S (UTC)", time.gmtime()) )
- time.sleep(10)
- request = urllib2.Request(url)
- user_agent = 'Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.7.12) Gecko/20050915 Firefox/1.0.7'
- request.add_header("User-Agent", user_agent)
- response = urllib2.urlopen(request)
- text = response.read()
- response.close()
- return text
-
-def untaggedGenerator(untaggedProject, limit = 500):
- lang = untaggedProject.split('.', 1)[0]
- project = '.' + untaggedProject.split('.', 1)[1]
- if lang == 'commons':
- link = 'http://tools.wikimedia.de/~daniel/WikiSense/UntaggedImages.php?wikifam=comm…'
- else:
- link = 'http://tools.wikimedia.de/~daniel/WikiSense/UntaggedImages.php?wikilang=' + lang + '&wikifam=' + project + '&order=img_timestamp&max=' + str(limit) + '&ofs=0&max=' + str(limit)
- text = pageText(link)
- #print text
- regexp = r"""<td valign='top' title='Name'><a href='http://.*?\..*?\.org/w/index\.php\?title=(.*?)'>.*?</a></td>"""
- results = re.findall(regexp, text)
- if results == []:
- print link
- raise NothingFound('Nothing found! Try to use the tool by yourself to be sure that it works!')
- else:
- for result in results:
- yield wikipedia.Page(self.site, result)
-
-def newImages(limit):
- # Search regular expression to find links like this (and the class attribute is optional too)
- # class="new" title="Immagine:Soldatino2.jpg">Immagine:Soldatino2.jpg</a>" <span class="comment">
- url = "/w/index.php?title=Special:Log&type=upload&user=&page=&pattern=&limit=%d&offset=0" % int(limit)
- site = wikipedia.getSite()
- textrun = site.getUrl(url)
- image_namespace = site.image_namespace() + ":"
- regexp = r'(class=\"new\" |)title=\"' + image_namespace + '(.*?)\.(\w\w\w|jpeg)\">.*?</a>\".*?<span class=\"comment\">'
- pos = 0
- done = list()
- ext_list = list()
- r = re.compile(regexp, re.UNICODE)
- while 1:
- m = r.search(textrun, pos)
- if m == None:
- wikipedia.output(u"\t\t>> All images checked. <<")
- break
- pos = m.end()
- new = m.group(1)
- im = m.group(2)
- ext = m.group(3)
- # This prevent pages with strange characters. They will be loaded without problem.
- image = im + "." + ext
- if new != '':
- wikipedia.output(u"Skipping %s because it has been deleted." % image)
- done.append(image)
- if image not in done:
- done.append(image)
- yield wikipedia.Page(site, 'Image:%s' % image)
-
-def main():
- starsList = ['link[ _]fa', 'link[ _]adq', 'enllaç[ _]ad',
- 'link[ _]ua', 'legătură[ _]af', 'destacado',
- 'ua', 'liên k[ _]t[ _]chọn[ _]lọc']
- summary = None
- addText = None
- regexSkip = None
- always = False
- exceptUrl = False
- genFactory = pagegenerators.GeneratorFactory()
- errorCount = 0
-
- for arg in wikipedia.handleArgs():
- if arg.startswith('-text'):
- if len(arg) == 5:
- addText = wikipedia.input(u'What text do you want to add?')
- else:
- addText = arg[6:]
- elif arg.startswith('-summary'):
- if len(arg) == 8:
- summary = wikipedia.input(u'What summary do you want to use?')
- else:
- summary = arg[9:]
- elif arg.startswith('-page'):
- if len(arg) == 5:
- generator = list(wikipedia.input(u'What page do you want to use?'))
- else:
- generator = listr(arg[6:])
- elif arg.startswith('-excepturl'):
- exceptUrl = True
- if len(arg) == 10:
- regexSkip = wikipedia.input(u'What text should I skip?')
- else:
- regexSkip = arg[11:]
- elif arg.startswith('-except'):
- if len(arg) == 7:
- regexSkip = wikipedia.input(u'What text should I skip?')
- else:
- regexSkip = arg[8:]
- elif arg.startswith('-untagged'):
- if len(arg) == 9:
- untaggedProject = wikipedia.input(u'What project do you want to use?')
- else:
- untaggedProject = arg[10:]
- generator = untaggedGenerator(untaggedProject)
- elif arg.startswith('-newimages'):
- if len(arg) == 10:
- limit = wikipedia.input(u'How many images do you want to check?')
- else:
- limit = arg[11:]
- generator = newImages(limit)
- elif arg == '-always':
- always = True
- else:
- generator = genFactory.handleArg(arg)
-
- site = wikipedia.getSite()
- pathWiki = site.family.nicepath(site.lang)
- if not generator:
- raise NoEnoughData('You have to specify the generator you want to use for the script!')
- if not addText:
- raise NoEnoughData('You have to specify what text you want to add!')
- if not summary:
- summary = 'Bot: Adding %s' % addText
- for page in generator:
- wikipedia.output(u'Loading %s...' % page.title())
- try:
- text = page.get()
- except wikipedia.NoPage:
- wikipedia.output(u"%s doesn't exist, skip!" % page.title())
- continue
- except wikipedia.IsRedirectPage:
- wikipedia.output(u"%s is a redirect, skip!" % page.title())
- continue
- if regexSkip and exceptUrl:
- url = '%s%s' % (pathWiki, page.urlname())
- result = re.findall(regexSkip, site.getUrl(url))
- elif regexSkip:
- result = re.findall(regexSkip, text)
- else:
- result = []
- if result != []:
- wikipedia.output(u'Exception! regex (or word) use with -except, is in the page. Skip!')
- continue
- newtext = text
- categoryNamespace = site.namespace(14)
- regexpCat = re.compile(r'\[\[((?:category|%s):.*?)\]\]' % categoryNamespace.lower(), re.I)
- categorieInside = regexpCat.findall(text)
- newtext = wikipedia.removeCategoryLinks(newtext, site)
- interwikiInside = page.interwiki()
- interwikiList = list()
- for paginetta in interwikiInside:
- nome = str(paginetta).split('[[')[1].split(']]')[0]
- interwikiList.append(nome)
- lang = nome.split(':')[0]
- newtext = wikipedia.removeLanguageLinks(newtext, site)
- interwikiList.sort()
- newtext += "\n%s" % addText
- for paginetta in categorieInside:
- try:
- newtext += '\n[[%s]]' % paginetta.decode('utf-8')
- except UnicodeEncodeError:
- try:
- newtext += '\n[[%s]]' % paginetta.decode('Latin-1')
- except UnicodeEncodeError:
- newtext += '\n[[%s]]' % paginetta
- newtext += '\n'
- starsListInPage = list()
- for star in starsList:
- regex = re.compile('(\{\{(?:template:|)%s\|.*?\}\}\n)' % star, re.I)
- risultato = regex.findall(newtext)
- if risultato != []:
- newtext = regex.sub('', newtext)
- for element in risultato:
- newtext += '\n%s' % element
- for paginetta in interwikiList:
- try:
- newtext += '\n[[%s]]' % paginetta.decode('utf-8')
- except UnicodeEncodeError:
- try:
- newtext += '\n[[%s]]' % paginetta.decode('Latin-1')
- except UnicodeEncodeError:
- newtext += '\n[[%s]]' % paginetta
- wikipedia.output(u"\n\n>>> \03{lightpurple}%s\03{default} <<<" % page.title())
- wikipedia.showDiff(text, newtext)
- while 1:
- if not always:
- choice = wikipedia.inputChoice(u'Do you want to accept these changes?', ['Yes', 'No', 'All'], ['y', 'N', 'a'], 'N')
- if choice.lower() in ['a', 'all']:
- always = True
- if choice.lower() in ['n', 'no']:
- break
- if choice.lower() in ['y', 'yes'] or always:
- try:
- page.put(newtext, summary)
- except wikipedia.EditConflict:
- wikipedia.output(u'Edit conflict! skip!')
- break
- except wikipedia.ServerError:
- errorCount += 1
- if errorCount < 5:
- wikipedia.output(u'Server Error! Wait..')
- time.sleep(3)
- continue
- else:
- raise wikipedia.ServerError(u'Fifth Server Error!')
- except wikipedia.SpamfilterError, e:
- wikipedia.output(u'Cannot change %s because of blacklist entry %s' % (page.title(), e.url))
- break
- except wikipedia.PageNotSaved, error:
- wikipedia.output(u'Error putting page: %s' % (error.args,))
- break
- except wikipedia.LockedPage:
- wikipedia.output(u'Skipping %s (locked page)' % (page.title(),))
- break
- else:
- # Break only if the errors are one after the other...
- errorCount = 0
- break
-if __name__ == "__main__":
- try:
- main()
- finally:
- wikipedia.stopme()
Copied: trunk/pywikipedia/add_text.py (from rev 4603, trunk/pywikipedia/AddText.py)
===================================================================
--- trunk/pywikipedia/add_text.py (rev 0)
+++ trunk/pywikipedia/add_text.py 2007-11-27 16:20:41 UTC (rev 4604)
@@ -0,0 +1,284 @@
+#!/usr/bin/python
+# -*- coding: utf-8 -*-
+"""
+This is a Bot written by Filnik to add a text in a given category.
+
+--- GenFactory Generator is used ---
+-start Define from which page should the Bot start
+-ref Use the ref as generator
+-cat Use a category as generator
+-filelinks Use all the links to an image as generator
+-unusedfiles
+-unwatched
+-withoutinterwiki
+-interwiki
+-file
+-uncatfiles
+-uncatcat
+-uncat
+-subcat
+-transcludes Use all the page that transclude a certain page as generator
+-weblink Use the pages with a certain web link as generator
+-links Use the links from a certain page as generator
+-regex Only work on pages whose titles match the given regex
+
+--- Other parameters ---
+-page Use a page as generator
+-text Define which text add
+-summary Define the summary to use
+-except Use a regex to understand if the template is already in the page
+-excepturl Use the html page as text where you want to see if there's the text, not the wiki-page.
+-newimages Add text in the new images
+-untagged Add text in the images that doesn't have any license template
+-always If used, the bot won't asked if it should add the text specified
+"""
+
+#
+# (C) Filnik, 2007
+#
+# Distributed under the terms of the MIT license.
+#
+__version__ = '$Id: AddText.py,v 1.0 2007/11/27 17:08:30 filnik Exp$'
+#
+
+import re, pagegenerators, urllib2, urllib
+import wikipedia, catlib
+
+class NoEnoughData(wikipedia.Error):
+ """ Error class for when the user doesn't specified all the data needed """
+
+class NothingFound(wikipedia.Error):
+ """ An exception indicating that a regex has return [] instead of results."""
+
+def pageText(url):
+ try:
+ request = urllib2.Request(url)
+ user_agent = 'Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.7.12) Gecko/20050915 Firefox/1.0.7'
+ request.add_header("User-Agent", user_agent)
+ response = urllib2.urlopen(request)
+ text = response.read()
+ response.close()
+ # When you load to many users, urllib2 can give this error.
+ except urllib2.HTTPError:
+ wikipedia.output(u"Server error. Pausing for 10 seconds... " + time.strftime("%d %b %Y %H:%M:%S (UTC)", time.gmtime()) )
+ time.sleep(10)
+ request = urllib2.Request(url)
+ user_agent = 'Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.7.12) Gecko/20050915 Firefox/1.0.7'
+ request.add_header("User-Agent", user_agent)
+ response = urllib2.urlopen(request)
+ text = response.read()
+ response.close()
+ return text
+
+def untaggedGenerator(untaggedProject, limit = 500):
+ lang = untaggedProject.split('.', 1)[0]
+ project = '.' + untaggedProject.split('.', 1)[1]
+ if lang == 'commons':
+ link = 'http://tools.wikimedia.de/~daniel/WikiSense/UntaggedImages.php?wikifam=comm…'
+ else:
+ link = 'http://tools.wikimedia.de/~daniel/WikiSense/UntaggedImages.php?wikilang=' + lang + '&wikifam=' + project + '&order=img_timestamp&max=' + str(limit) + '&ofs=0&max=' + str(limit)
+ text = pageText(link)
+ #print text
+ regexp = r"""<td valign='top' title='Name'><a href='http://.*?\..*?\.org/w/index\.php\?title=(.*?)'>.*?</a></td>"""
+ results = re.findall(regexp, text)
+ if results == []:
+ print link
+ raise NothingFound('Nothing found! Try to use the tool by yourself to be sure that it works!')
+ else:
+ for result in results:
+ yield wikipedia.Page(self.site, result)
+
+def newImages(limit):
+ # Search regular expression to find links like this (and the class attribute is optional too)
+ # class="new" title="Immagine:Soldatino2.jpg">Immagine:Soldatino2.jpg</a>" <span class="comment">
+ url = "/w/index.php?title=Special:Log&type=upload&user=&page=&pattern=&limit=%d&offset=0" % int(limit)
+ site = wikipedia.getSite()
+ textrun = site.getUrl(url)
+ image_namespace = site.image_namespace() + ":"
+ regexp = r'(class=\"new\" |)title=\"' + image_namespace + '(.*?)\.(\w\w\w|jpeg)\">.*?</a>\".*?<span class=\"comment\">'
+ pos = 0
+ done = list()
+ ext_list = list()
+ r = re.compile(regexp, re.UNICODE)
+ while 1:
+ m = r.search(textrun, pos)
+ if m == None:
+ wikipedia.output(u"\t\t>> All images checked. <<")
+ break
+ pos = m.end()
+ new = m.group(1)
+ im = m.group(2)
+ ext = m.group(3)
+ # This prevent pages with strange characters. They will be loaded without problem.
+ image = im + "." + ext
+ if new != '':
+ wikipedia.output(u"Skipping %s because it has been deleted." % image)
+ done.append(image)
+ if image not in done:
+ done.append(image)
+ yield wikipedia.Page(site, 'Image:%s' % image)
+
+def main():
+ starsList = ['link[ _]fa', 'link[ _]adq', 'enllaç[ _]ad',
+ 'link[ _]ua', 'legătură[ _]af', 'destacado',
+ 'ua', 'liên k[ _]t[ _]chọn[ _]lọc']
+ summary = None
+ addText = None
+ regexSkip = None
+ always = False
+ exceptUrl = False
+ genFactory = pagegenerators.GeneratorFactory()
+ errorCount = 0
+
+ for arg in wikipedia.handleArgs():
+ if arg.startswith('-text'):
+ if len(arg) == 5:
+ addText = wikipedia.input(u'What text do you want to add?')
+ else:
+ addText = arg[6:]
+ elif arg.startswith('-summary'):
+ if len(arg) == 8:
+ summary = wikipedia.input(u'What summary do you want to use?')
+ else:
+ summary = arg[9:]
+ elif arg.startswith('-page'):
+ if len(arg) == 5:
+ generator = list(wikipedia.input(u'What page do you want to use?'))
+ else:
+ generator = listr(arg[6:])
+ elif arg.startswith('-excepturl'):
+ exceptUrl = True
+ if len(arg) == 10:
+ regexSkip = wikipedia.input(u'What text should I skip?')
+ else:
+ regexSkip = arg[11:]
+ elif arg.startswith('-except'):
+ if len(arg) == 7:
+ regexSkip = wikipedia.input(u'What text should I skip?')
+ else:
+ regexSkip = arg[8:]
+ elif arg.startswith('-untagged'):
+ if len(arg) == 9:
+ untaggedProject = wikipedia.input(u'What project do you want to use?')
+ else:
+ untaggedProject = arg[10:]
+ generator = untaggedGenerator(untaggedProject)
+ elif arg.startswith('-newimages'):
+ if len(arg) == 10:
+ limit = wikipedia.input(u'How many images do you want to check?')
+ else:
+ limit = arg[11:]
+ generator = newImages(limit)
+ elif arg == '-always':
+ always = True
+ else:
+ generator = genFactory.handleArg(arg)
+
+ site = wikipedia.getSite()
+ pathWiki = site.family.nicepath(site.lang)
+ if not generator:
+ raise NoEnoughData('You have to specify the generator you want to use for the script!')
+ if not addText:
+ raise NoEnoughData('You have to specify what text you want to add!')
+ if not summary:
+ summary = 'Bot: Adding %s' % addText
+ for page in generator:
+ wikipedia.output(u'Loading %s...' % page.title())
+ try:
+ text = page.get()
+ except wikipedia.NoPage:
+ wikipedia.output(u"%s doesn't exist, skip!" % page.title())
+ continue
+ except wikipedia.IsRedirectPage:
+ wikipedia.output(u"%s is a redirect, skip!" % page.title())
+ continue
+ if regexSkip and exceptUrl:
+ url = '%s%s' % (pathWiki, page.urlname())
+ result = re.findall(regexSkip, site.getUrl(url))
+ elif regexSkip:
+ result = re.findall(regexSkip, text)
+ else:
+ result = []
+ if result != []:
+ wikipedia.output(u'Exception! regex (or word) use with -except, is in the page. Skip!')
+ continue
+ newtext = text
+ categoryNamespace = site.namespace(14)
+ regexpCat = re.compile(r'\[\[((?:category|%s):.*?)\]\]' % categoryNamespace.lower(), re.I)
+ categorieInside = regexpCat.findall(text)
+ newtext = wikipedia.removeCategoryLinks(newtext, site)
+ interwikiInside = page.interwiki()
+ interwikiList = list()
+ for paginetta in interwikiInside:
+ nome = str(paginetta).split('[[')[1].split(']]')[0]
+ interwikiList.append(nome)
+ lang = nome.split(':')[0]
+ newtext = wikipedia.removeLanguageLinks(newtext, site)
+ interwikiList.sort()
+ newtext += "\n%s" % addText
+ for paginetta in categorieInside:
+ try:
+ newtext += '\n[[%s]]' % paginetta.decode('utf-8')
+ except UnicodeEncodeError:
+ try:
+ newtext += '\n[[%s]]' % paginetta.decode('Latin-1')
+ except UnicodeEncodeError:
+ newtext += '\n[[%s]]' % paginetta
+ newtext += '\n'
+ starsListInPage = list()
+ for star in starsList:
+ regex = re.compile('(\{\{(?:template:|)%s\|.*?\}\}\n)' % star, re.I)
+ risultato = regex.findall(newtext)
+ if risultato != []:
+ newtext = regex.sub('', newtext)
+ for element in risultato:
+ newtext += '\n%s' % element
+ for paginetta in interwikiList:
+ try:
+ newtext += '\n[[%s]]' % paginetta.decode('utf-8')
+ except UnicodeEncodeError:
+ try:
+ newtext += '\n[[%s]]' % paginetta.decode('Latin-1')
+ except UnicodeEncodeError:
+ newtext += '\n[[%s]]' % paginetta
+ wikipedia.output(u"\n\n>>> \03{lightpurple}%s\03{default} <<<" % page.title())
+ wikipedia.showDiff(text, newtext)
+ while 1:
+ if not always:
+ choice = wikipedia.inputChoice(u'Do you want to accept these changes?', ['Yes', 'No', 'All'], ['y', 'N', 'a'], 'N')
+ if choice.lower() in ['a', 'all']:
+ always = True
+ if choice.lower() in ['n', 'no']:
+ break
+ if choice.lower() in ['y', 'yes'] or always:
+ try:
+ page.put(newtext, summary)
+ except wikipedia.EditConflict:
+ wikipedia.output(u'Edit conflict! skip!')
+ break
+ except wikipedia.ServerError:
+ errorCount += 1
+ if errorCount < 5:
+ wikipedia.output(u'Server Error! Wait..')
+ time.sleep(3)
+ continue
+ else:
+ raise wikipedia.ServerError(u'Fifth Server Error!')
+ except wikipedia.SpamfilterError, e:
+ wikipedia.output(u'Cannot change %s because of blacklist entry %s' % (page.title(), e.url))
+ break
+ except wikipedia.PageNotSaved, error:
+ wikipedia.output(u'Error putting page: %s' % (error.args,))
+ break
+ except wikipedia.LockedPage:
+ wikipedia.output(u'Skipping %s (locked page)' % (page.title(),))
+ break
+ else:
+ # Break only if the errors are one after the other...
+ errorCount = 0
+ break
+if __name__ == "__main__":
+ try:
+ main()
+ finally:
+ wikipedia.stopme()
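The renamed add_text.py strips category and interwiki links from the page, appends the new text, then restores the links at the bottom (interwikis sorted). A self-contained sketch of that reordering, with simplified regexes standing in for wikipedia.removeCategoryLinks / removeLanguageLinks (the namespace names here are illustrative, not exhaustive):

```python
import re

# Simplified link matchers; the real script asks the site object for its
# localized category namespace and uses pywikipedia's link parsers.
CAT_RX = re.compile(r'\[\[(?:Category|Categoria):.*?\]\]\n?', re.I)
IW_RX = re.compile(r'\[\[[a-z]{2,3}:[^\]]+\]\]\n?')

def add_text_bottom(text, addition):
    """Insert `addition` above the category and interwiki links.

    Collect the links, strip them from the body, append the new text,
    then re-append categories and (sorted) interwikis at the bottom.
    """
    cats = CAT_RX.findall(text)
    iws = IW_RX.findall(text)
    body = IW_RX.sub('', CAT_RX.sub('', text)).rstrip()
    body += '\n' + addition
    for link in cats:
        body += '\n' + link.strip()
    for link in sorted(iws):
        body += '\n' + link.strip()
    return body
```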