jenkins-bot has submitted this change and it was merged.
Change subject: Porting various Commons upload bots to core from compat.
......................................................................
Porting various Commons upload bots to core from compat.
Added scripts:
1) imagecopy.py
2) imagecopy_self.py
3) imageharvest.py
4) panoramiopicker.py
Changed pywikibot/config2.py to support the scripts above.
Bug: T66856
Change-Id: I5ea3a2131badba22fdc5e99deb5c40a49f4f0998
---
M pywikibot/config2.py
A scripts/imagecopy.py
A scripts/imagecopy_self.py
A scripts/imageharvest.py
A scripts/panoramiopicker.py
M tests/script_tests.py
6 files changed, 2,119 insertions(+), 0 deletions(-)
Approvals:
John Vandenberg: Looks good to me, approved
jenkins-bot: Verified
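
Review note: the exclusion-list check that scripts/imagecopy.py performs (getautoskip and doiskip in the patch) can be sketched on its own. This is a minimal sketch, not the code that was merged. The names parse_skip_list and is_skipped are hypothetical, and the use of re.escape is an addition that the patch itself does not make. It assumes the skip file holds template names separated by "{{", as Uploadbot.localskips.txt does.

```python
import re


def parse_skip_list(text):
    """Split skip-list text such as '{{NowCommons{{Db-f8' into names."""
    return [name.strip() for name in text.split('{{')[1:] if name.strip()]


def is_skipped(page_text, skip_names):
    """Return True if page_text transcludes any template in skip_names."""
    for name in skip_names:
        # Accept either case for the first letter, as doiskip() does.
        first = '[' + re.escape(name[0].upper()) + re.escape(name[0].lower()) + ']'
        # re.escape is an assumption added here; the patch concatenates
        # the raw template name into the pattern.
        pattern = r'\{\{\s*' + first + re.escape(name[1:]) + r'(\}\}|\|)'
        if re.search(pattern, page_text):
            return True
    return False


skips = parse_skip_list('{{NowCommons{{Db-f8')
print(is_skipped('{{nowCommons|File:Foo.jpg}}', skips))
print(is_skipped('{{Information|description=x}}', skips))
```

Escaping the name keeps templates that contain regex metacharacters from changing the meaning of the pattern; whether the merged script needs that is left to the maintainers.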
diff --git a/pywikibot/config2.py b/pywikibot/config2.py
index c5ad2c1..4cac315 100644
--- a/pywikibot/config2.py
+++ b/pywikibot/config2.py
@@ -655,6 +655,13 @@
'reviewer': u'', # If so, under what reviewer name?
}
+# Using the Panoramio api
+panoramio = {
+ 'review': False, # Do we automatically mark our uploads as reviewed?
+ 'reviewer': u'', # If so, under what reviewer name?
+}
+
+
# ############# COPYRIGHT SETTINGS ##############
# Enable/disable search engine in copyright.py script
diff --git a/scripts/imagecopy.py b/scripts/imagecopy.py
new file mode 100644
index 0000000..0450fc7
--- /dev/null
+++ b/scripts/imagecopy.py
@@ -0,0 +1,563 @@
+# -*- coding: utf-8 -*-
+"""
+Script to copy files from a local Wikimedia wiki to Wikimedia Commons.
+
+It uses CommonsHelper to not leave any information out and CommonSense
+to automatically categorise the file. After copying, a NowCommons
+template is added to the local wiki's file. It uses a local exclusion
+list to skip files with templates not allowed on Wikimedia Commons. If no
+categories have been found, the file will be tagged on Commons.
+
+This bot uses a graphical interface and may not work in a command-line-only
+environment.
+
+Requests for improvement for CommonsHelper output should be directed to
+Magnus Manske at his talk page. Please be very specific in your request
+(describe current output and expected output) and note an example file,
+so he can test at: [[de:Benutzer Diskussion:Magnus Manske]]. You can
+write to him in German or English.
+
+Examples
+
+Work on a single image
+ python pwb.py imagecopy.py -page:Image:<imagename>
+Work on the 100 newest images:
+ python pwb.py imagecopy.py -newimages:100
+Work on all images in a category:<cat>
+ python pwb.py imagecopy.py -cat:<cat>
+Work on all images which transclude a template
+ python pwb.py imagecopy.py -transcludes:<template>
+
+See pagegenerators.py for more ways to get a list of images.
+By default the bot works on your home wiki (set in user-config)
+
+Known issues/FIXMEs (no critical issues known):
+* make it use pagegenerators.py
+** Implemented in rewrite
+* Some variable names are in Spanish, which makes the code harder to read.
+** Almost all variables are now in English
+* Depending on sorting within a file category, the "next batch" is sometimes
+ not working, leading to an endless loop
+** Using pagegenerators now
+* Different wikis can have different exclusion lists. A parameter for the
+ exclusion list Uploadbot.localskips.txt would probably be nice.
+* Bot should probably use API instead of query.php
+** Api? Query? Wikipedia.py!
+* Should request alternative name if file name already exists on Commons
+** Implemented in rewrite
+* Exits after last file in category was processed, aborting all pending
+ threads.
+** Implemented proper threading in rewrite
+* Should take user-config.py as input for project and lang variables
+** Implemented in rewrite
+* Should require a Commons user to be present in user-config.py before
+ working
+* Should probably have an input field for additional categories
+* Should probably have an option to change uploadtext with file
+* required i18n options for NowCommons template (e.g. {{subst:ncd}} on
+ en.wp). Currently needs customisation to work properly. Bot was tested
+ successfully on nl.wp (12k+ files copied and deleted locally) and en.wp
+ (about 100 files copied and SieBot has bot approval for tagging {{ncd}}
+ with this bot)
+** Implemented
+* {{NowCommons|xxx}} requires the namespace prefix Image: on most wikis
+ and can be left out on others. This needs to be taken care of when
+ implementing i18n
+** Implemented
+* This bot should probably get a small tutorial at meta with a few
+ screenshots.
+"""
+#
+# Based on upload.py by:
+# (C) Rob W.W. Hooft, Andre Engels 2003-2007
+# (C) Wikipedian, Keichwa, Leogregianin, Rikwade, Misza13 2003-2007
+#
+# New bot by:
+# (C) Kyle/Orgullomoore, Siebrand Mazeland 2007-2008
+#
+# Another rewrite by:
+# (C) Multichill 2008-2011
+# (C) Pywikibot team, 2007-2015
+#
+# Distributed under the terms of the MIT license.
+#
+from __future__ import absolute_import, unicode_literals
+
+__version__ = '$Id$'
+
+import codecs
+import re
+import socket
+import threading
+import webbrowser
+
+import pywikibot
+
+from pywikibot import pagegenerators, config, i18n
+
+from pywikibot.tools import PY2
+
+from scripts import image, upload
+
+if not PY2:
+ import tkinter as Tkinter
+
+ from urllib.parse import urlencode
+ from urllib.request import urlopen
+else:
+ import Tkinter
+
+ from urllib import urlencode, urlopen
+
+try:
+ from pywikibot.userinterfaces.gui import Tkdialog
+except ImportError as _tk_error:
+ Tkdialog = None
+
+NL = ''
+
+nowCommonsTemplate = {
+ '_default': u'{{NowCommons|%s}}',
+ 'af': u'{{NowCommons|File:%s}}',
+ 'als': u'{{NowCommons|%s}}',
+ 'am': u'{{NowCommons|File:%s}}',
+ 'ang': u'{{NowCommons|File:%s}}',
+ 'ar': u'{{الآن كومنز|%s}}',
+ 'ast': u'{{EnCommons|File:%s}}',
+ 'az': u'{{NowCommons|%s}}',
+ 'bar': u'{{NowCommons|%s}}',
+ 'bg': u'{{NowCommons|%s}}',
+ 'bn': u'{{NowCommons|File:%s}}',
+ 'bs': u'{{NowCommons|%s}}',
+ 'ca': u'{{AraCommons|%s}}',
+ 'cs': u'{{NowCommons|%s}}',
+ 'cy': u'{{NowCommons|File:%s}}',
+ 'da': u'{{NowCommons|File:%s}}',
+ 'de': u'{{NowCommons|%s}}',
+ 'dsb': u'{{NowCommons|%s}}',
+ 'el': u'{{NowCommons|%s}}',
+ 'en': u'{{subst:ncd|%s}}',
+ 'eo': u'{{Nun en komunejo|%s}}',
+ 'es': u'{{EnCommons|File:%s}}',
+ 'et': u'{{NüüdCommonsis|File:%s}}',
+ 'fa': u'{{NowCommons|%s}}',
+ 'fi': u'{{NowCommons|%s}}',
+ 'fo': u'{{NowCommons|File:%s}}',
+ 'fr': u'{{Image sur Commons|%s}}',
+ 'fy': u'{{NowCommons|%s}}',
+ 'ga': u'{{Ag Cómhaoin|File:%s}}',
+ 'gl': u'{{EnCommons|File:%s}}',
+ 'gv': u'{{NowCommons|File:%s}}',
+ 'he': u'{{גם בוויקישיתוף|%s}}',
+ 'hr': u'{{NowCommons|%s}}',
+ 'hsb': u'{{NowCommons|%s}}',
+ 'hu': u'{{Azonnali-commons|%s}}',
+ 'ia': u'{{NowCommons|File:%s}}',
+ 'id': u'{{NowCommons|File:%s}}',
+ 'ilo': u'{{NowCommons|File:%s}}',
+ 'io': u'{{NowCommons|%s}}',
+ 'is': u'{{NowCommons|%s}}',
+ 'it': u'{{NowCommons|%s}}',
+ 'ja': u'{{NowCommons|File:%s}}',
+ 'jv': u'{{NowCommons|File:%s}}',
+ 'ka': u'{{NowCommons|File:%s}}',
+ 'kn': u'{{NowCommons|File:%s}}',
+ 'ko': u'{{NowCommons|File:%s}}',
+ 'ku': u'{{NowCommons|%s}}',
+ 'lb': u'{{Elo op Commons|%s}}',
+ 'li': u'{{NowCommons|%s}}',
+ 'lt': u'{{NowCommons|File:%s}}',
+ 'lv': u'{{NowCommons|File:%s}}',
+ 'mk': u'{{NowCommons|File:%s}}',
+ 'mn': u'{{NowCommons|File:%s}}',
+ 'ms': u'{{NowCommons|%s}}',
+ 'nds-nl': u'{{NoenCommons|File:%s}}',
+ 'nl': u'{{NuCommons|%s}}',
+ 'nn': u'{{No på Commons|File:%s}}',
+ 'no': u'{{NowCommons|%s}}',
+ 'oc': u'{{NowCommons|File:%s}}',
+ 'pl': u'{{NowCommons|%s}}',
+ 'pt': u'{{NowCommons|%s}}',
+ 'ro': u'{{AcumCommons|File:%s}}',
+ 'ru': u'{{Перенесено на Викисклад|%s}}',
+ 'sa': u'{{NowCommons|File:%s}}',
+ 'scn': u'{{NowCommons|File:%s}}',
+ 'sh': u'{{NowCommons|File:%s}}',
+ 'sk': u'{{NowCommons|File:%s}}',
+ 'sl': u'{{OdslejZbirka|%s}}',
+ 'sq': u'{{NowCommons|File:%s}}',
+ 'sr': u'{{NowCommons|File:%s}}',
+ 'st': u'{{NowCommons|File:%s}}',
+ 'su': u'{{IlaharKiwari|File:%s}}',
+ 'sv': u'{{NowCommons|%s}}',
+ 'sw': u'{{NowCommons|%s}}',
+ 'ta': u'{{NowCommons|File:%s}}',
+ 'th': u'{{มีที่คอมมอนส์|File:%s}}',
+ 'tl': u'{{NasaCommons|File:%s}}',
+ 'tr': u'{{NowCommons|%s}}',
+ 'uk': u'{{NowCommons|File:%s}}',
+ 'ur': u'{{NowCommons|File:%s}}',
+ 'vec': u'{{NowCommons|%s}}',
+ 'vi': u'{{NowCommons|File:%s}}',
+ 'vo': u'{{InKobädikos|%s}}',
+ 'wa': u'{{NowCommons|%s}}',
+ 'zh': u'{{NowCommons|File:%s}}',
+ 'zh-min-nan': u'{{Commons ū|%s}}',
+ 'zh-yue': u'{{subst:Ncd|File:%s}}',
+}
+
+moveToCommonsTemplate = {
+ 'ar': [u'نقل إلى كومنز'],
+ 'en': [u'Commons ok', u'Copy to Wikimedia Commons', u'Move to commons',
+ u'Movetocommons', u'To commons',
+ u'Copy to Wikimedia Commons by BotMultichill'],
+ 'fi': [u'Commonsiin'],
+ 'fr': [u'Image pour Commons'],
+ 'hsb': [u'Kopěruj do Wikimedia Commons'],
+ 'hu': [u'Commonsba'],
+ 'is': [u'Færa á Commons'],
+ 'ms': [u'Hantar ke Wikimedia Commons'],
+ 'nl': [u'Verplaats naar Wikimedia Commons', u'VNC'],
+ 'pl': [u'Do Commons', u'NaCommons', u'Na Commons'],
+ 'ru': [u'На Викисклад'],
+ 'sl': [u'Skopiraj v Zbirko'],
+ 'sr': [u'За оставу'],
+ 'sv': [u'Till Commons'],
+ 'zh': [u'Copy to Wikimedia Commons'],
+}
+
+
+def pageTextPost(url, parameters):
+ gotInfo = False
+ while not gotInfo:
+ try:
+ commonsHelperPage = urlopen(
+ "http://tools.wmflabs.org/commonshelper/index.php", parameters)
+ data = commonsHelperPage.read().decode('utf-8')
+ gotInfo = True
+ except IOError:
+ pywikibot.output(u'Got an IOError, let\'s try again')
+ except socket.timeout:
+ pywikibot.output(u'Got a timeout, let\'s try again')
+ return data
+
+
+class imageTransfer(threading.Thread):
+
+ """Facilitate transfer of image/file to commons."""
+
+ def __init__(self, imagePage, newname, category):
+ self.imagePage = imagePage
+ self.newname = newname
+ self.category = category
+ threading.Thread.__init__(self)
+
+ def run(self):
+ tosend = {'language': self.imagePage.site.language().encode('utf-8'),
+ 'image': self.imagePage.title(withNamespace=False).encode('utf-8'),
+ 'newname': self.newname.encode('utf-8'),
+ 'project': self.imagePage.site.family.name.encode('utf-8'),
+ 'username': '',
+ 'commonsense': '1',
+ 'remove_categories': '1',
+ 'ignorewarnings': '1',
+ 'doit': 'Uitvoeren'
+ }
+
+ tosend = urlencode(tosend)
+ pywikibot.output(tosend)
+ CH = pageTextPost('http://tools.wmflabs.org/commonshelper/index.php',
+ tosend)
+ pywikibot.output('Got CH desc.')
+
+ tablock = CH.split('<textarea ')[1].split('>')[0]
+ CH = CH.split('<textarea ' + tablock + '>')[1].split('</textarea>')[0]
+ CH = CH.replace(u'&times;', u'×')
+ CH = self.fixAuthor(CH)
+ pywikibot.output(CH)
+
+ # I want every picture to be tagged with the bottemplate so i can check my contributions later.
+ CH = u'\n\n{{BotMoveToCommons|' + self.imagePage.site.language() + \
+ '.' + self.imagePage.site.family.name + \
+ '|year={{subst:CURRENTYEAR}}|month={{subst:CURRENTMONTHNAME}}|day={{subst:CURRENTDAY}}}}' + \
+ CH
+
+ if self.category:
+ CH = CH.replace(u'{{subst:Unc}} <!-- Remove this line once you have added categories -->', u'')
+ CH += u'[[Category:' + self.category + u']]'
+
+ bot = upload.UploadRobot(url=self.imagePage.fileUrl(), description=CH,
+ useFilename=self.newname, keepFilename=True,
+ verifyDescription=False, ignoreWarning=True,
+ targetSite=pywikibot.Site('commons', 'commons'))
+ bot.run()
+
+ # Should check if the image actually was uploaded
+ if pywikibot.Page(pywikibot.Site('commons', 'commons'),
+ u'Image:' + self.newname).exists():
+ # Get a fresh copy, force to get the page so we don't run into edit
+ # conflicts
+ imtxt = self.imagePage.get(force=True)
+
+ # Remove the move to commons templates
+ if self.imagePage.site.language() in moveToCommonsTemplate:
+ for moveTemplate in moveToCommonsTemplate[self.imagePage.site.language()]:
+ imtxt = re.sub(u'(?i)\{\{' + moveTemplate + u'[^\}]*\}\}',
+ u'', imtxt)
+
+ # add {{NowCommons}}
+ if self.imagePage.site.language() in nowCommonsTemplate:
+ addTemplate = nowCommonsTemplate[self.imagePage.site.language()] % self.newname
+ else:
+ addTemplate = nowCommonsTemplate['_default'] % self.newname
+
+ commentText = i18n.twtranslate(self.imagePage.site,
+ 'commons-file-now-available',
+ {'localfile': self.imagePage.title(withNamespace=False),
+ 'commonsfile': self.newname})
+
+ pywikibot.showDiff(self.imagePage.get(), imtxt + addTemplate)
+ self.imagePage.put(imtxt + addTemplate, comment=commentText)
+
+ self.gen = pagegenerators.FileLinksGenerator(self.imagePage)
+ self.preloadingGen = pagegenerators.PreloadingGenerator(self.gen)
+
+ # If the image is uploaded under a different name, replace all instances
+ if self.imagePage.title(withNamespace=False) != self.newname:
+ moveSummary = i18n.twtranslate(self.imagePage.site,
+ 'commons-file-moved',
+ {'localfile': self.imagePage.title(withNamespace=False),
+ 'commonsfile': self.newname})
+
+ imagebot = image.ImageRobot(generator=self.preloadingGen,
+ oldImage=self.imagePage.title(withNamespace=False),
+ newImage=self.newname,
+ summary=moveSummary, always=True,
+ loose=True)
+ imagebot.run()
+ return
+
+ def fixAuthor(self, pageText):
+ """Fix the author field in the information template."""
+ informationRegex = re.compile(
+ u'\|Author\=Original uploader was (?P<author>\[\[:\w+:\w+:\w+\|\w+\]\] at \[.+\])')
+ selfRegex = re.compile(
+ u'\{\{self\|author\=(?P<author>\[\[:\w+:\w+:\w+\|\w+\]\] at \[.+\])\|')
+
+ # Find the |Author=Original uploader was ....
+ informationMatch = informationRegex.search(pageText)
+
+ # Find the {{self|author=
+ selfMatch = selfRegex.search(pageText)
+
+ # Check if both are found and are equal
+ if (informationMatch and selfMatch):
+ if(informationMatch.group('author') == selfMatch.group('author')):
+ # Replace |Author=Original uploader was ... with |Author= ...
+ pageText = informationRegex.sub(r'|Author=\g<author>', pageText)
+ return pageText
+
+
+# -label ok skip view
+# textarea
+archivo = config.datafilepath("Uploadbot.localskips.txt")
+try:
+ open(archivo, 'r')
+except IOError:
+ tocreate = open(archivo, 'w')
+ tocreate.write("{{NowCommons")
+ tocreate.close()
+
+
+def getautoskip():
+ """Get a list of templates to skip."""
+ f = codecs.open(archivo, 'r', 'utf-8')
+ txt = f.read()
+ f.close()
+ toreturn = txt.split('{{')[1:]
+ return toreturn
+
+
+class TkdialogIC(Tkdialog):
+
+ """The dialog window for image info."""
+
+ def __init__(self, image_title, content, uploader, url, templates,
+ commonsconflict=0):
+ super(TkdialogIC, self).__init__()
+ self.root = Tkinter.Tk()
+ # "%dx%d%+d%+d" % (width, height, xoffset, yoffset)
+ # Always appear the same size and in the bottom-left corner
+ self.root.geometry("600x200+100-100")
+ self.root.title(image_title)
+ self.changename = ''
+ self.skip = 0
+ self.url = url
+ self.uploader = "Unknown"
+ # uploader.decode('utf-8')
+ scrollbar = Tkinter.Scrollbar(self.root, orient=Tkinter.VERTICAL)
+ label = Tkinter.Label(self.root, text=u"Enter new name or leave blank.")
+ imageinfo = Tkinter.Label(self.root, text='Uploaded by %s.' % uploader)
+ textarea = Tkinter.Text(self.root)
+ textarea.insert(Tkinter.END, content.encode('utf-8'))
+ textarea.config(state=Tkinter.DISABLED, height=8, width=40, padx=0, pady=0,
+ wrap=Tkinter.WORD, yscrollcommand=scrollbar.set)
+ scrollbar.config(command=textarea.yview)
+ self.entry = Tkinter.Entry(self.root)
+
+ self.templatelist = Tkinter.Listbox(self.root, bg="white", height=5)
+
+ for template in templates:
+ self.templatelist.insert(Tkinter.END, template)
+ autoskip_button = Tkinter.Button(self.root, text="Add to AutoSkip",
+ command=self.add2_auto_skip)
+ browser_button = Tkinter.Button(self.root, text='View in browser',
+ command=self.open_in_browser)
+ skip_button = Tkinter.Button(self.root, text="Skip", command=self.skip_file)
+ ok_button = Tkinter.Button(self.root, text="OK", command=self.ok_file)
+
+ # Start grid
+ label.grid(row=0)
+ ok_button.grid(row=0, column=1, rowspan=2)
+ skip_button.grid(row=0, column=2, rowspan=2)
+ browser_button.grid(row=0, column=3, rowspan=2)
+
+ self.entry.grid(row=1)
+
+ textarea.grid(row=2, column=1, columnspan=3)
+ scrollbar.grid(row=2, column=5)
+ self.templatelist.grid(row=2, column=0)
+
+ autoskip_button.grid(row=3, column=0)
+ imageinfo.grid(row=3, column=1, columnspan=4)
+
+ def ok_file(self):
+ """The user pressed the OK button."""
+ self.changename = self.entry.get()
+ self.root.destroy()
+
+ def getnewname(self):
+ """Activate the dialog and return the new name and if the image is skipped."""
+ self.root.mainloop()
+ return (self.changename, self.skip)
+
+ def open_in_browser(self):
+ """The user pressed the View in browser button."""
+ webbrowser.open(self.url)
+
+ def add2_auto_skip(self):
+ """The user pressed the Add to AutoSkip button."""
+ templateid = int(self.templatelist.curselection()[0])
+ template = self.templatelist.get(templateid)
+ with codecs.open(archivo, 'a', 'utf-8') as f:
+ f.write('{{' + template)
+ self.skip_file()
+
+
+def doiskip(pagetext):
+ """Skip this image or not.
+
+ Returns True if the image is on the skip list, otherwise False
+ """
+ saltos = getautoskip()
+ # print saltos
+ for salto in saltos:
+ rex = u'\{\{\s*[' + salto[0].upper() + salto[0].lower() + ']' + \
+ salto[1:] + '(\}\}|\|)'
+ # print rex
+ if re.search(rex, pagetext):
+ return True
+ return False
+
+
+def main(*args):
+ """Process command line arguments and invoke bot."""
+ generator = None
+ imagepage = None
+ always = False
+ category = u''
+ # Load a lot of default generators
+ local_args = pywikibot.handle_args(args)
+ genFactory = pagegenerators.GeneratorFactory()
+
+ for arg in local_args:
+ if arg == '-always':
+ always = True
+ elif arg.startswith('-cc:'):
+ category = arg[len('-cc:'):]
+ else:
+ genFactory.handleArg(arg)
+
+ generator = genFactory.getCombinedGenerator()
+ if not generator:
+ pywikibot.bot.suggest_help(missing_generator=True)
+ return False
+
+ pregenerator = pagegenerators.PreloadingGenerator(generator)
+
+ for page in pregenerator:
+ skip = False
+ if page.exists() and (page.namespace() == 6) and (
+ not page.isRedirectPage()):
+ imagepage = pywikibot.FilePage(page.site(), page.title())
+
+ # First do autoskip.
+ if doiskip(imagepage.get()):
+ pywikibot.output("Skipping " + page.title())
+ skip = True
+ else:
+ # The first upload is last in the list.
+ try:
+ username = imagepage.getLatestUploader()[0]
+ except NotImplementedError:
+ # No API, using the page file instead
+ (datetime, username, resolution, size,
+ comment) = imagepage.getFileVersionHistory().pop()
+ if always:
+ newname = imagepage.title(withNamespace=False)
+ CommonsPage = pywikibot.Page(pywikibot.Site('commons',
+ 'commons'),
+ u'File:%s' % newname)
+ if CommonsPage.exists():
+ skip = True
+ else:
+ while True:
+ # Do the TkdialogIC to accept/reject and change the name
+ (newname, skip) = TkdialogIC(
+ imagepage.title(withNamespace=False),
+ imagepage.get(), username, imagepage.permalink(),
+ imagepage.templates()).getnewname()
+
+ if skip:
+ pywikibot.output('Skipping this image')
+ break
+
+ # Did we enter a new name?
+ if len(newname) == 0:
+ # Take the old name
+ newname = imagepage.title(withNamespace=False)
+ else:
+ newname = newname.decode('utf-8')
+
+ # Check if the image already exists
+ CommonsPage = pywikibot.Page(
+ pywikibot.Site('commons', 'commons'),
+ u'File:' + newname)
+ if not CommonsPage.exists():
+ break
+ else:
+ pywikibot.output('Image already exists, pick another name or skip this image')
+ # We don't overwrite images, pick another name, go to the start of the loop
+
+ if not skip:
+ imageTransfer(imagepage, newname, category).start()
+
+ pywikibot.output(u'Still ' + str(threading.activeCount()) + u' active threads, lets wait')
+ for openthread in threading.enumerate():
+ if openthread != threading.currentThread():
+ openthread.join()
+ pywikibot.output(u'All threads are done')
+
+
+if __name__ == "__main__":
+ main()
diff --git a/scripts/imagecopy_self.py b/scripts/imagecopy_self.py
new file mode 100644
index 0000000..73f458f
--- /dev/null
+++ b/scripts/imagecopy_self.py
@@ -0,0 +1,1007 @@
+# -*- coding: utf-8 -*-
+"""
+Script to copy self-published files from English Wikipedia to Wikimedia Commons.
+
+This bot is based on imagecopy.py and intended to be used to empty out
+http://en.wikipedia.org/wiki/Category:Self-published_work
+
+This bot uses a graphical interface and may not work in a command-line-only
+environment.
+
+Examples
+
+Work on a single file
+ python pwb.py imagecopy_self.py -page:file:<filename>
+Work on all images in a category:<cat>
+ python pwb.py imagecopy_self.py -cat:<cat>
+Work on all images which transclude a template
+ python pwb.py imagecopy_self.py -transcludes:<template>
+
+See pagegenerators.py for more ways to get a list of images.
+By default the bot works on your home wiki (set in user-config)
+
+This is a first test version and should be used with care.
+
+Use -nochecktemplate if you don't want to add the check template. Be sure to
+check it yourself.
+
+Todo:
+*Queues with threads have to be implemented for the information collecting part
+ and for the upload part.
+*Categories are now on a single line. Something like hotcat would be nice.
+
+"""
+#
+# Based on upload.py by:
+# (C) Rob W.W. Hooft, Andre Engels 2003-2007
+# (C) Wikipedian, Keichwa, Leogregianin, Rikwade, Misza13 2003-2007
+#
+# New bot by:
+# (C) Kyle/Orgullomoore, Siebrand Mazeland 2007
+#
+# Another rewrite by:
+# (C) Multichill 2008
+#
+# English Wikipedia specific bot by:
+# (C) Multichill 2010-2012
+#
+# (C) Pywikibot team, 2010-2015
+#
+# Distributed under the terms of the MIT license.
+#
+from __future__ import absolute_import, unicode_literals
+
+__version__ = '$Id$'
+
+import re
+import threading
+import webbrowser
+
+from datetime import datetime
+
+import pywikibot
+
+from pywikibot import pagegenerators, i18n
+
+from pywikibot.tools import PY2
+
+from scripts import imagerecat, image, upload
+
+if not PY2:
+ import tkinter as Tkinter
+ from queue import Queue
+else:
+ import Tkinter
+ from Queue import Queue
+
+try:
+ from pywikibot.userinterfaces.gui import Tkdialog
+except ImportError as _tk_error:
+ Tkdialog = None
+
+NL = ''
+
+nowCommonsTemplate = {
+ 'de': u'{{NowCommons|%s}}',
+ 'en': u'{{NowCommons|1=File:%s|date=~~~~~|reviewer={{subst:REVISIONUSER}}}}',
+ 'lb': u'{{Elo op Commons|%s}}',
+ 'nds-nl': u'{{NoenCommons|1=File:%s}}',
+ 'shared': u'{{NowCommons|1=File:%s|date=~~~~~|reviewer={{subst:REVISIONUSER}}}}',
+}
+
+moveToCommonsTemplate = {
+ 'de': [u'NowCommons', u'NC', u'NCT', u'Nowcommons'],
+ 'en': [u'Commons ok', u'Copy to Wikimedia Commons', u'Move to commons',
+ u'Movetocommons', u'To commons',
+ u'Copy to Wikimedia Commons by BotMultichill'],
+ 'lb': [u'Move to commons'],
+ 'nds-nl': [u'Noar Commons', u'VNC'],
+ 'shared': [u'Move'],
+}
+
+skipTemplates = {
+ 'de': [u'Löschprüfung',
+ u'NoCommons',
+ u'NowCommons',
+ u'NowCommons/Mängel',
+ u'NowCommons-Überprüft',
+ u'Wappenrecht',
+ ],
+ 'en': [u'Db-f1',
+ u'Db-f2',
+ u'Db-f3',
+ u'Db-f7',
+ u'Db-f8',
+ u'Db-f9',
+ u'Db-f10',
+ u'Do not move to Commons',
+ u'NowCommons',
+ u'CommonsNow',
+ u'Nowcommons',
+ u'NowCommonsThis',
+ u'Nowcommons2',
+ u'NCT',
+ u'Nowcommonsthis',
+ u'Moved to commons',
+ u'Now Commons',
+ u'Now at commons',
+ u'Db-nowcommons',
+ u'WikimediaCommons',
+ u'Now commons',
+ u'Di-no source',
+ u'Di-no license',
+ u'Di-no permission',
+ u'Di-orphaned fair use',
+ u'Di-no source no license',
+ u'Di-replaceable fair use',
+ u'Di-no fair use rationale',
+ u'Di-disputed fair use rationale',
+ u'Puf',
+ u'PUI',
+ u'Pui',
+ u'Ffd',
+ u'PD-user', # Only the self templates are supported for now.
+ u'Ticket Scan',
+ u'Non-free 2D art',
+ u'Non-free 3D art',
+ u'Non-free architectural work',
+ u'Non-free fair use in',
+ ],
+ 'lb': [u'Läschen',
+ ],
+ 'nds-nl': [u'Allinnig Wikipedie',
+ u'Bepark',
+ u'Gienidee',
+ u'NoenCommons',
+ u'NowCommons',
+ ],
+ 'shared': [u''],
+}
+
+
+licenseTemplates = {
+ 'de': [(u'\{\{Bild-CC-by-sa/3\.0/de\}\}[\s\r\n]*\{\{Bild-CC-by-sa/3\.0\}\}[\s\r\n]*\{\{Bild-GFDL-Neu\}\}',
+ u'{{Self|Cc-by-sa-3.0-de|Cc-by-sa-3.0|GFDL|author=[[:%(lang)s:User:%(author)s|%(author)s]] at [http://%(lang)s.%' +
+ u'(family)s.org %(lang)s.%(family)s]}}'),
+ (u'\{\{Bild-GFDL\}\}[\s\r\n]*\{\{Bild-CC-by-sa/(\d\.\d)\}\}',
+ u'{{Self|GFDL|Cc-by-sa-3.0-migrated|Cc-by-sa-\\1|author=[[:%(lang)s:User:%(author)s|%(author)s]] at [http://%(lang' +
+ u')s.%(family)s.org %(lang)s.%(family)s]}}'),
+ (u'\{\{Bild-GFDL\}\}',
+ u'{{Self|GFDL|Cc-by-sa-3.0-migrated|author=[[:%(lang)s:User:%(author)s|%(author)s]] at [http://%(lang)s.%(family)s' +
+ u'.org %(lang)s.%(family)s]}}'),
+ (u'\{\{Bild-CC-by-sa/(\d\.\d)\}\}',
+ u'{{Self|Cc-by-sa-\\1|author=[[:%(lang)s:User:%(author)s|%(author)s]] at [http://%(lang)s.%(family)s.org %(lang)s.' +
+ u'%(family)s]}}'),
+ (u'\{\{Bild-CC-by-sa/(\d\.\d)/de\}\}',
+ u'{{Self|Cc-by-sa-\\1-de|author=[[:%(lang)s:User:%(author)s|%(author)s]] at [http://%(lang)s.%(family)s.org' +
+ u' %(lang)s.%(family)s]}}'),
+ (u'\{\{Bild-CC-by/(\d\.\d)\}\}',
+ u'{{Self|Cc-by-\\1|author=[[:%(lang)s:User:%(author)s|%(author)s]] at [http://%(lang)s.%(family)s.org' +
+ u' %(lang)s.%(family)s]}}'),
+ (u'\{\{Bild-CC-by/(\d\.\d)/de\}\}',
+ u'{{Self|Cc-by-\\1-de|author=[[:%(lang)s:User:%(author)s|%(author)s]] at [http://%(lang)s.%(family)s.org' +
+ u' %(lang)s.%(family)s]}}'),
+ ],
+ 'en': [(u'\{\{(self|self2)\|([^\}]+)\}\}',
+ u'{{Self|\\2|author=[[:%(lang)s:User:%(author)s|%(author)s]] at [http://%(lang)s.%(family)s.org' +
+ u' %(lang)s.%(family)s]}}'),
+ (u'\{\{(GFDL-self|GFDL-self-no-disclaimers)\|([^\}]+)\}\}',
+ u'{{Self|GFDL|\\2|author=[[:%(lang)s:User:%(author)s|%(author)s]] at [http://%(lang)s.%(family)s.org' +
+ u' %(lang)s.%(family)s]}}'),
+ (u'\{\{GFDL-self-with-disclaimers\|([^\}]+)\}\}',
+ u'{{Self|GFDL-with-disclaimers|\\1|author=[[:%(lang)s:User:%(author)s|%(author)s]] at' +
+ u' [http://%(lang)s.%(family)s.org %(lang)s.%(family)s]}}'),
+ (u'\{\{PD-self(\|date=[^\}]+)?\}\}',
+ u'{{PD-user-w|%(lang)s|%(family)s|%(author)s}}'),
+ (u'\{\{Multilicense replacing placeholder(\|[^\}\|=]+=[^\}\|]+)*(?P<migration>\|[^\}\|=]+=[^\}\|]+)' +
+ u'(\|[^\}\|=]+=[^\}\|]+)*\}\}',
+ u'{{Self|GFDL|Cc-by-sa-2.5,2.0,1.0\\g<migration>|author=[[:%(lang)s:User:%(author)s|%(author)s]]' +
+ u' at [http://%(lang)s.%(family)s.org %(lang)s.%(family)s]}}'),
+ (u'\{\{Multilicense replacing placeholder new(\|class=[^\}]+)?\}\}',
+ u'{{Self|GFDL|Cc-by-sa-3.0,2.5,2.0,1.0|author=[[:%(lang)s:User:%(author)s|%(author)s]] at' +
+ u' [http://%(lang)s.%(family)s.org %(lang)s.%(family)s]}}'),
+ ],
+ 'lb': [(u'\{\{(self|self2)\|([^\}]+)\}\}', u'{{Self|\\2|author=[[:%(lang)s:User:%(author)s|%(author)s]]' +
+ u' at [http://%(lang)s.%(family)s.org %(lang)s.%(family)s]}}'),
+ ],
+ 'nds-nl': [(u'\{\{PD-eigenwark\}\}', u'{{PD-user-w|%(lang)s|%(family)s|%(author)s}}'),
+ ],
+ 'shared': [(u'\{\{(self|self2)\|([^\}]+)\}\}', u'{{Self|\\2|author=%(author)s at old wikivoyage shared}}'),
+ ],
+}
+
+sourceGarbage = {
+ 'de': [u'==\s*Beschreibung,\sQuelle\s*==',
+ u'==\s*Beschrieving\s*==',
+ u'==\s*\[\[Wikipedia:Lizenzvorlagen für Bilder\|Lizenz\]\]\s*==',
+ ],
+ 'en': [u'==\s*Description\s*==',
+ u'==\s*Summary\s*==',
+ u'==\s*Licensing:?\s*==',
+ u'\{\{'
+ u'(Copy to Wikimedia Commons|Move to Commons|Move to commons|'
+ u'Move to Wikimedia Commons|Copy to commons|Mtc|MtC|MTC|CWC|CtWC|'
+ u'CTWC|Ctwc|Tocommons|Copy to Commons|To Commons|Movetocommons|'
+ u'Move to Wikimedia commons|Move-to-commons|Commons ok|ToCommons|'
+ u'To commons|MoveToCommons|Copy to wikimedia commons|'
+ u'Upload to commons|CopyToCommons|Copytocommons|MITC|MovetoCommons|'
+ u'Do move to Commons|Orphan image)'
+ u'(\|[^\}]+)?\}\}'
+ ],
+ 'lb': [u'==\s*Résumé\s*==',
+ u'==\s*Lizenz:\s*==',
+ ],
+ 'nds-nl': [u'==\s*Licentie\s*==',
+ u'\{\{DEFAULTSORT:\{\{PAGENAME\}\}\}\}',
+ ],
+ 'shared': [u'==\s*Beschreibung,\sQuelle\s*==',
+ u'==\s*Licensing:?\s*==',
+ ],
+}
+
+informationTemplate = {
+ 'de': 'Information',
+ 'en': 'Information',
+ 'nds-nl': 'Information',
+ 'shared': 'Information',
+}
+
+informationFields = {
+ 'de': {
+ u'anmerkungen': u'remarks', # FIXME: More flexible
+ u'beschreibung': u'description',
+ u'quelle': u'source',
+ u'datum': u'date',
+ u'urheber': u'author',
+ u'permission': u'permission',
+ u'andere Versione': u'other versions',
+ },
+ 'en': {
+ u'location': u'remarks',
+ u'description': u'description',
+ u'source': u'source',
+ u'date': u'date',
+ u'author': u'author',
+ u'permission': u'permission',
+ u'other versions': u'other versions',
+ },
+ 'nds-nl': {
+ u'location': u'remarks',
+ u'description': u'description',
+ u'source': u'source',
+ u'date': u'date',
+ u'author': u'author',
+ u'permission': u'permission',
+ u'other versions': u'other versions',
+ },
+ 'shared': {
+ u'description': u'description',
+ u'source': u'source',
+ u'date': u'date',
+ u'author': u'author',
+ u'permission': u'permission',
+ u'other versions': u'other versions',
+ },
+}
+
+
+def supportedSite():
+ """Check if this site is supported."""
+ site = pywikibot.Site()
+ lang = site.code
+
+ lists = [
+ nowCommonsTemplate,
+ moveToCommonsTemplate,
+ skipTemplates,
+ licenseTemplates,
+ sourceGarbage,
+ ]
+ for l in lists:
+ if not l.get(lang):
+ return False
+ return True
+
+
+class imageFetcher(threading.Thread):
+
+ """Tries to fetch information for all images in the generator."""
+
+ def __init__(self, pagegenerator, prefetchQueue):
+ self.pagegenerator = pagegenerator
+ self.prefetchQueue = prefetchQueue
+ imagerecat.initLists()
+ threading.Thread.__init__(self)
+
+ def run(self):
+ for page in self.pagegenerator:
+ self.processImage(page)
+ self.prefetchQueue.put(None)
+ print(u'Fetched all images.')
+ return True
+
+ def processImage(self, page):
+ """Work on a single image."""
+ if page.exists() and (page.namespace() == 6) and \
+ (not page.isRedirectPage()):
+ imagepage = pywikibot.FilePage(page.site(), page.title())
+
+ # First do autoskip.
+ if self.doiskip(imagepage):
+ pywikibot.output(
+ u'Skipping %s : Got a template on the skip list.'
+ % page.title())
+ return False
+
+ text = imagepage.get()
+ foundMatch = False
+ for (regex, replacement) in licenseTemplates[page.site.language()]:
+ match = re.search(regex, text, flags=re.IGNORECASE)
+ if match:
+ foundMatch = True
+ if not foundMatch:
+ pywikibot.output(
+ u'Skipping %s : No suitable license template was found.'
+ % page.title())
+ return False
+ self.prefetchQueue.put(self.getNewFields(imagepage))
+
+ def doiskip(self, imagepage):
+ """Skip this image or not.
+
+ Returns True if the image is on the skip list, otherwise False
+
+ """
+ for template in imagepage.templates():
+ if template in skipTemplates[imagepage.site.language()]:
+ pywikibot.output(
+ u'Found %s which is on the template skip list' % template)
+ return True
+ return False
+
+ def getNewFields(self, imagepage):
+ """Build a new description based on the imagepage."""
+ if u'{{Information' in imagepage.get() or \
+ u'{{information' in imagepage.get():
+ (description, date, source, author, permission,
+ other_versions) = self.getNewFieldsFromInformation(imagepage)
+ else:
+ (description, date, source,
+ author) = self.getNewFieldsFromFreetext(imagepage)
+ permission = u''
+ other_versions = u''
+
+ licensetemplate = self.getNewLicensetemplate(imagepage)
+ categories = self.getNewCategories(imagepage)
+ return {u'imagepage': imagepage,
+ u'filename': imagepage.title(withNamespace=False),
+ u'description': description,
+ u'date': date,
+ u'source': source,
+ u'author': author,
+ u'permission': permission,
+ u'other_versions': other_versions,
+ u'licensetemplate': licensetemplate,
+ u'categories': categories,
+ u'skip': False}
+
+ def getNewFieldsFromInformation(self, imagepage):
+ """Try to extract fields from current information template for the new information template."""
+ # fields = [u'location', u'description', u'source', u'date', u'author', u'permission', u'other versions']
+ # FIXME: The implementation for German has to be checked for the "strange" fields
+
+ description = u''
+ source = u''
+ date = u''
+ author = u''
+ permission = u''
+ other_versions = u''
+ contents = {}
+
+ for key, value in informationFields.get(
+ imagepage.site.language()).items():
+ contents[value] = u''
+
+ templates = imagepage.templatesWithParams()
+
+ for (template, params) in templates:
+ if template == u'Information':
+ for param in params:
+ # Split at =
+ (field, sep, value) = param.partition(u'=')
+ # To lowercase, replace underscores with spaces and strip surrounding spaces
+ field = field.lower().replace(u'_', u' ').strip()
+ # See if first part is in fields list
+ if field in informationFields.get(imagepage.site.language()).keys():
+ # Ok, field is good, store it.
+ contents[informationFields.get(imagepage.site.language()).get(field)] = value.strip()
+
+ # We now have the contents from the old information template. Let's get the info for the new one
+
+ # Description
+ # FIXME: Add {{<lang>|<original text>}} if <lang> is valid at Commons
+ if contents[u'description']:
+ description = self.convertLinks(contents[u'description'],
+ imagepage.site())
+ if contents.get(u'remarks') and contents[u'remarks']:
+ if description == u'':
+ description = self.convertLinks(contents[u'remarks'],
+ imagepage.site())
+ else:
+ description += u'<BR/>\n' + self.convertLinks(
+ contents[u'remarks'], imagepage.site())
+
+ # Source
+ source = self.getSource(imagepage,
+ source=self.convertLinks(contents[u'source'],
+ imagepage.site()))
+
+ # Date
+ if contents[u'date']:
+ date = contents[u'date']
+ else:
+ date = self.getUploadDate(imagepage)
+
+ # Author
+ if not (contents[u'author'] == u'' or
+ contents[u'author'] == self.getAuthor(imagepage)):
+ author = self.convertLinks(contents[u'author'], imagepage.site())
+ else:
+ author = self.getAuthorText(imagepage)
+
+ # Permission
+ # Still have to filter out crap like "see below" or "yes"
+ if contents[u'permission']:
+ # Strip off the license template if it's in the permission section
+ for (regex, repl) in licenseTemplates[imagepage.site.language()]:
+ contents[u'permission'] = re.sub(regex, u'',
+ contents[u'permission'],
+ flags=re.IGNORECASE)
+ permission = self.convertLinks(contents[u'permission'],
+ imagepage.site())
+
+ # Other_versions
+ if contents[u'other versions']:
+ other_versions = self.convertLinks(contents[u'other versions'],
+ imagepage.site())
+
+ return (description, date, source, author, permission, other_versions)
+
+ def getNewFieldsFromFreetext(self, imagepage):
+ """Try to extract fields from free text for the new information template."""
+ text = imagepage.get()
+ # text = re.sub(u'== Summary ==', u'', text, re.IGNORECASE)
+ # text = re.sub(u'== Licensing ==', u'', text, re.IGNORECASE)
+ # text = re.sub(u'\{\{(self|self2)\|[^\}]+\}\}', u'', text, re.IGNORECASE)
+
+ for toRemove in sourceGarbage[imagepage.site.language()]:
+ text = re.sub(toRemove, u'', text, flags=re.IGNORECASE)
+
+ for (regex, repl) in licenseTemplates[imagepage.site.language()]:
+ text = re.sub(regex, u'', text, flags=re.IGNORECASE)
+
+ text = pywikibot.removeCategoryLinks(text, imagepage.site()).strip()
+
+ description = self.convertLinks(text.strip(), imagepage.site())
+ date = self.getUploadDate(imagepage)
+ source = self.getSource(imagepage)
+ author = self.getAuthorText(imagepage)
+ return (description, date, source, author)
+
+ def getUploadDate(self, imagepage):
+ """Get the original upload date.
+
+ The date is put in the date field of the new information template
+ when nothing better is available.
+
+ """
+ uploadtime = imagepage.getFileVersionHistory()[-1][0]
+ uploadDatetime = datetime.strptime(uploadtime, u'%Y-%m-%dT%H:%M:%SZ')
+ return (u'{{Date|' + str(uploadDatetime.year) + u'|' +
+ str(uploadDatetime.month) + u'|' + str(uploadDatetime.day) +
+ u'}} (original upload date)')
+
+ def getSource(self, imagepage, source=u''):
+ """Get the text to put in the source field of the new information template."""
+ site = imagepage.site()
+ lang = site.code
+ family = site.family.name
+ if source == u'':
+ source = u'{{Own}}'
+
+ return source.strip() + u'<BR />Transferred from [http://%(lang)s.%(family)s.org %(lang)s.%(family)s]' \
+ % {u'lang': lang, u'family': family}
+
+ def getAuthorText(self, imagepage):
+ """Get the original uploader to put in the author field of the new information template."""
+ site = imagepage.site()
+ lang = site.code
+ family = site.family.name
+
+ firstuploader = self.getAuthor(imagepage)
+ return (u'[[:%(lang)s:User:%(firstuploader)s|%(firstuploader)s]] at [http://%(lang)s.%(family)s.org %(lang)s.%(family)s]'
+ % {u'lang': lang, u'family': family,
+ u'firstuploader': firstuploader})
+
+ def getAuthor(self, imagepage):
+ """Get the first uploader."""
+ return imagepage.getFileVersionHistory()[-1][1].strip()
+
+ def convertLinks(self, text, sourceSite):
+ """Convert links from the current wiki to Commons."""
+ lang = sourceSite.code
+ family = sourceSite.family.name
+ conversions = [
+ (u'\[\[([^\[\]\|]+)\|([^\[\]\|]+)\]\]', u'[[:%(lang)s:\\1|\\2]]'),
+ (u'\[\[([^\[\]\|]+)\]\]', u'[[:%(lang)s:\\1|\\1]]'),
+ ]
+ for (regex, replacement) in conversions:
+ text = re.sub(regex, replacement % {u'lang': lang,
+ u'family': family}, text)
+ return text
+
+ def getNewLicensetemplate(self, imagepage):
+ """Get a license template to put on the image to be uploaded."""
+ text = imagepage.get()
+ site = imagepage.site()
+ lang = site.code
+ family = site.family.name
+ result = u''
+ for (regex,
+ replacement) in licenseTemplates[imagepage.site.language()]:
+ match = re.search(regex, text, flags=re.IGNORECASE)
+ if match:
+ result = re.sub(regex, replacement, match.group(0),
+ flags=re.IGNORECASE)
+ return result % {u'author': self.getAuthor(imagepage),
+ u'lang': lang,
+ u'family': family}
+ return result
+
+ def getNewCategories(self, imagepage):
+ """Get categories for the image.
+
+ Don't forget to filter.
+
+ """
+ result = u''
+ (commonshelperCats, usage,
+ galleries) = imagerecat.getCommonshelperCats(imagepage)
+ newcats = imagerecat.applyAllFilters(commonshelperCats)
+ for newcat in newcats:
+ result += u'[[Category:' + newcat + u']] '
+ return result
+
+
+class userInteraction(threading.Thread):
+
+ """Prompt all images to the user."""
+
+ def __init__(self, prefetchQueue, uploadQueue):
+ self.prefetchQueue = prefetchQueue
+ self.uploadQueue = uploadQueue
+ self.autonomous = False
+ threading.Thread.__init__(self)
+
+ def run(self):
+ while True:
+ fields = self.prefetchQueue.get()
+ if fields:
+ self.processImage(fields)
+ else:
+ break
+ self.uploadQueue.put(None)
+ print(u'User worked on all images.')
+ return True
+
+ def setAutonomous(self):
+ """Don't do any user interaction."""
+ self.autonomous = True
+ return
+
+ def processImage(self, fields):
+ """Work on a single image."""
+ if self.autonomous:
+ # Check if the image already exists. Do nothing if the name is
+ # already taken.
+ CommonsPage = pywikibot.Page(pywikibot.Site('commons',
+ 'commons'),
+ u'File:' + fields.get('filename'))
+ if CommonsPage.exists():
+ return False
+ else:
+ while True:
+ # Do the TkdialogICS to accept/reject and change the name
+ fields = TkdialogICS(fields).getnewmetadata()
+
+ if fields.get('skip'):
+ pywikibot.output(u'Skipping %s : User pressed skip.'
+ % fields.get('imagepage').title())
+ return False
+
+ # Check if the image already exists
+ CommonsPage = pywikibot.Page(pywikibot.Site('commons',
+ 'commons'),
+ u'File:' + fields.get('filename'))
+ if not CommonsPage.exists():
+ break
+ else:
+ pywikibot.output('Image already exists, pick another name '
+ 'or skip this image')
+ # We don't overwrite images, pick another name, go to the start of the loop
+
+ # Put the fields in the queue to be uploaded
+ self.uploadQueue.put(fields)
+
+
+class TkdialogICS(Tkdialog):
+
+ """The dialog window for image info."""
+
+ def __init__(self, fields): # imagepage, description, date, source, author, licensetemplate, categories):
+ self.root = Tkinter.Tk()
+ # "%dx%d%+d%+d" % (width, height, xoffset, yoffset)
+ # Always appear the same size and in the bottom-left corner
+ # FIXME : Base this on the screen size or make it possible for the user
+ # to configure this
+
+ # Get all the relevant fields
+ super(TkdialogICS, self).__init__()
+ self.imagepage = fields.get('imagepage')
+ self.filename = fields.get('filename')
+
+ self.description = fields.get('description')
+ self.date = fields.get('date')
+ self.source = fields.get('source')
+ self.author = fields.get('author')
+ self.permission = fields.get('permission')
+ self.other_versions = fields.get('other_versions')
+
+ self.licensetemplate = fields.get('licensetemplate')
+ self.categories = fields.get('categories')
+ self.skip = False
+
+ # Start building the page
+ self.root.geometry("1500x400+100-100")
+ self.root.title(self.filename)
+
+ self.url = self.imagepage.permalink()
+ self.scrollbar = Tkinter.Scrollbar(self.root, orient=Tkinter.VERTICAL)
+
+ self.old_description = Tkinter.Text(self.root)
+ self.old_description.insert(Tkinter.END,
+ self.imagepage.get().encode('utf-8'))
+ self.old_description.config(state=Tkinter.DISABLED, height=8, width=140, padx=0,
+ pady=0, wrap=Tkinter.WORD,
+ yscrollcommand=self.scrollbar.set)
+
+ self.scrollbar.config(command=self.old_description.yview)
+
+ self.old_description_label = Tkinter.Label(self.root,
+ text=u'The old description was : ')
+ self.new_description_label = Tkinter.Label(self.root,
+ text=u'The new fields are : ')
+ self.filename_label = Tkinter.Label(self.root, text=u'Filename : ')
+ self.information_description_label = Tkinter.Label(self.root,
+ text=u'Description : ')
+ self.information_date_label = Tkinter.Label(self.root, text=u'Date : ')
+ self.information_source_label = Tkinter.Label(self.root, text=u'Source : ')
+ self.information_author_label = Tkinter.Label(self.root, text=u'Author : ')
+ self.information_permission_label = Tkinter.Label(self.root,
+ text=u'Permission : ')
+ self.information_other_versions_label = Tkinter.Label(self.root,
+ text=u'Other versions : ')
+
+ self.information_licensetemplate_label = Tkinter.Label(self.root,
+ text=u'License : ')
+ self.information_categories_label = Tkinter.Label(self.root,
+ text=u'Categories : ')
+
+ self.filename_field = Tkinter.Entry(self.root)
+ self.information_description = Tkinter.Entry(self.root)
+ self.information_date = Tkinter.Entry(self.root)
+ self.information_source = Tkinter.Entry(self.root)
+ self.information_author = Tkinter.Entry(self.root)
+ self.information_permission = Tkinter.Entry(self.root)
+ self.information_other_versions = Tkinter.Entry(self.root)
+ self.information_licensetemplate = Tkinter.Entry(self.root)
+ self.information_categories = Tkinter.Entry(self.root)
+
+ self.field_width = 120
+
+ self.filename_field.config(width=self.field_width)
+ self.information_description.config(width=self.field_width)
+ self.information_date.config(width=self.field_width)
+ self.information_source.config(width=self.field_width)
+ self.information_author.config(width=self.field_width)
+ self.information_permission.config(width=self.field_width)
+ self.information_other_versions.config(width=self.field_width)
+ self.information_licensetemplate.config(width=self.field_width)
+ self.information_categories.config(width=self.field_width)
+
+ self.filename_field.insert(0, self.filename)
+ self.information_description.insert(0, self.description)
+ self.information_date.insert(0, self.date)
+ self.information_source.insert(0, self.source)
+ self.information_author.insert(0, self.author)
+ self.information_permission.insert(0, self.permission)
+ self.information_other_versions.insert(0, self.other_versions)
+ self.information_licensetemplate.insert(0, self.licensetemplate)
+ self.information_categories.insert(0, self.categories)
+
+ self.browser_button = Tkinter.Button(self.root, text='View in browser',
+ command=self.open_in_browser)
+ self.skip_button = Tkinter.Button(self.root, text="Skip",
+ command=self.skipFile)
+ self.ok_button = Tkinter.Button(self.root, text="OK",
+ command=self.ok_file)
+
+ # Start grid
+ self.old_description_label.grid(row=0, column=0, columnspan=3)
+
+ self.old_description.grid(row=1, column=0, columnspan=3)
+ self.scrollbar.grid(row=1, column=3)
+ self.new_description_label.grid(row=2, column=0, columnspan=3)
+
+ # All the labels for the new fields
+ self.filename_label.grid(row=3, column=0)
+ self.information_description_label.grid(row=4, column=0)
+ self.information_date_label.grid(row=5, column=0)
+ self.information_source_label.grid(row=6, column=0)
+ self.information_author_label.grid(row=7, column=0)
+ self.information_permission_label.grid(row=8, column=0)
+ self.information_other_versions_label.grid(row=9, column=0)
+ self.information_licensetemplate_label.grid(row=10, column=0)
+ self.information_categories_label.grid(row=11, column=0)
+
+ # The new fields
+ self.filename_field.grid(row=3, column=1, columnspan=3)
+ self.information_description.grid(row=4, column=1, columnspan=3)
+ self.information_date.grid(row=5, column=1, columnspan=3)
+ self.information_source.grid(row=6, column=1, columnspan=3)
+ self.information_author.grid(row=7, column=1, columnspan=3)
+ self.information_permission.grid(row=8, column=1, columnspan=3)
+ self.information_other_versions.grid(row=9, column=1, columnspan=3)
+ self.information_licensetemplate.grid(row=10, column=1, columnspan=3)
+ self.information_categories.grid(row=11, column=1, columnspan=3)
+
+ # The buttons at the bottom
+ self.ok_button.grid(row=12, column=3, rowspan=2)
+ self.skip_button.grid(row=12, column=2, rowspan=2)
+ self.browser_button.grid(row=12, column=1, rowspan=2)
+
+ def ok_file(self):
+ """The user pressed the OK button."""
+ self.filename = self.filename_field.get()
+ self.description = self.information_description.get()
+ self.date = self.information_date.get()
+ self.source = self.information_source.get()
+ self.author = self.information_author.get()
+ self.permission = self.information_permission.get()
+ self.other_versions = self.information_other_versions.get()
+ self.licensetemplate = self.information_licensetemplate.get()
+ self.categories = self.information_categories.get()
+
+ self.root.destroy()
+
+ def getnewmetadata(self):
+ """Activate the dialog and return the new name and if the image is skipped."""
+ self.root.mainloop()
+
+ return {u'imagepage': self.imagepage,
+ u'filename': self.filename,
+ u'description': self.description,
+ u'date': self.date,
+ u'source': self.source,
+ u'author': self.author,
+ u'permission': self.permission,
+ u'other_versions': self.other_versions,
+ u'licensetemplate': self.licensetemplate,
+ u'categories': self.categories,
+ u'skip': self.skip}
+
+ def open_in_browser(self):
+ """The user pressed the View in browser button."""
+ webbrowser.open(self.url)
+
+
+class uploader(threading.Thread):
+
+ """Upload all images."""
+
+ def __init__(self, uploadQueue):
+ self.uploadQueue = uploadQueue
+ self.checktemplate = True
+ threading.Thread.__init__(self)
+
+ def run(self):
+ while True: # Change later
+ fields = self.uploadQueue.get()
+ if fields:
+ self.processImage(fields)
+ else:
+ break
+ return True
+
+ def nochecktemplate(self):
+ """Don't want to add {{BotMoveToCommons}}."""
+ self.checktemplate = False
+ return
+
+ def processImage(self, fields):
+ """Work on a single image."""
+ cid = self.buildNewImageDescription(fields)
+ pywikibot.output(cid)
+ bot = upload.UploadRobot(url=fields.get('imagepage').fileUrl(),
+ description=cid,
+ useFilename=fields.get('filename'),
+ keepFilename=True, verifyDescription=False,
+ ignoreWarning=True,
+ targetSite=pywikibot.Site('commons',
+ 'commons'))
+ bot.run()
+
+ self.tagNowcommons(fields.get('imagepage'),
+ fields.get('filename'))
+ self.replaceUsage(fields.get('imagepage'),
+ fields.get('filename'))
+
+ def buildNewImageDescription(self, fields):
+ """Build a new information template."""
+ site = fields.get('imagepage').site()
+ lang = site.code
+ family = site.family.name
+
+ cid = u''
+ if self.checktemplate:
+ cid += u'\n{{BotMoveToCommons|%(lang)s.%(family)s|year={{subst:CURRENTYEAR}}|month={{subst:CURRENTMONTHNAME}}|day={{subst:CURRENTDAY}}}}\n' \
+ % {u'lang': lang, u'family': family}
+ cid += u'== {{int:filedesc}} ==\n'
+ cid += u'{{Information\n'
+ cid += u'|description=%(description)s\n' % fields
+ cid += u'|date=%(date)s\n' % fields
+ cid += u'|source=%(source)s\n' % fields
+ cid += u'|author=%(author)s\n' % fields
+ cid += u'|permission=%(permission)s\n' % fields
+ cid += u'|other_versions=%(other_versions)s\n' % fields
+ cid += u'}}\n'
+ cid += u'== {{int:license}} ==\n'
+ cid += u'%(licensetemplate)s\n' % fields
+ cid += u'\n'
+ cid += self.getOriginalUploadLog(fields.get('imagepage'))
+ cid += u'__NOTOC__\n'
+ if fields.get('categories').strip() == u'':
+ cid = cid + u'{{Subst:Unc}}'
+ else:
+ cid = cid + u'%(categories)s\n' % fields
+ return cid
+
+ def getOriginalUploadLog(self, imagepage):
+ """Get the original upload log to put at the bottom of the image description page at Commons."""
+ filehistory = imagepage.getFileVersionHistory()
+ filehistory.reverse()
+
+ site = imagepage.site()
+ lang = site.code
+ family = site.family.name
+
+ sourceimage = imagepage.site.get_address(
+ imagepage.title()).replace(u'&redirect=no&useskin=monobook', u'')
+
+ result = u'== {{Original upload log}} ==\n'
+ # Adjacent string literals are joined before "%" is applied, so every
+ # placeholder in the whole message gets formatted (with "+" only the
+ # last literal would be).
+ result += (u'The original description page is/was '
+ u'[http://%(lang)s.%(family)s.org%(sourceimage)s here]. All following '
+ u'user names refer to %(lang)s.%(family)s.\n'
+ % {u'lang': lang, u'family': family,
+ u'sourceimage': sourceimage})
+ for (timestamp, username, resolution, size, comment) in filehistory:
+ date = datetime.strptime(
+ timestamp, u'%Y-%m-%dT%H:%M:%SZ').strftime('%Y-%m-%d %H:%M')
+ result += (u'* %(date)s [[:%(lang)s:user:%(username)s|%(username)s]] %(resolution)s'
+ u' (%(size)s bytes) \'\'<nowiki>%(comment)s</nowiki>\'\'\n' % {
+ u'lang': lang,
+ u'family': family,
+ u'date': date,
+ u'username': username,
+ u'resolution': resolution,
+ u'size': size,
+ u'comment': comment})
+
+ return result
+
+ def tagNowcommons(self, imagepage, filename):
+ """Tag the image which has been moved to Commons for deletion."""
+ if pywikibot.Page(pywikibot.Site('commons', 'commons'),
+ u'File:' + filename).exists():
+ # Get a fresh copy, force to get the page so we don't run into edit
+ # conflicts
+ imtxt = imagepage.get(force=True)
+
+ # Remove the move to commons templates
+ if imagepage.site.language() in moveToCommonsTemplate:
+ for moveTemplate in moveToCommonsTemplate[imagepage.site.language()]:
+ imtxt = re.sub(u'(?i)\{\{' + moveTemplate +
+ u'[^\}]*\}\}', u'', imtxt)
+
+ # add {{NowCommons}}
+ if imagepage.site.language() in nowCommonsTemplate:
+ addTemplate = nowCommonsTemplate[
+ imagepage.site.language()] % filename
+ else:
+ addTemplate = nowCommonsTemplate['_default'] % filename
+
+ commentText = i18n.twtranslate(
+ imagepage.site(), 'commons-file-now-available',
+ {'localfile': imagepage.title(withNamespace=False),
+ 'commonsfile': filename})
+
+ pywikibot.showDiff(imagepage.get(), imtxt + addTemplate)
+ imagepage.put(imtxt + addTemplate, comment=commentText)
+
+ def replaceUsage(self, imagepage, filename):
+ """If the image is uploaded under a different name, replace all usage."""
+ if imagepage.title(withNamespace=False) != filename:
+ gen = pagegenerators.FileLinksGenerator(imagepage)
+ preloadingGen = pagegenerators.PreloadingGenerator(gen)
+
+ moveSummary = i18n.twtranslate(
+ imagepage.site(), 'commons-file-moved',
+ {'localfile': imagepage.title(withNamespace=False),
+ 'commonsfile': filename})
+
+ imagebot = image.ImageRobot(generator=preloadingGen,
+ oldImage=imagepage.title(withNamespace=False),
+ newImage=filename, summary=moveSummary,
+ always=True, loose=True)
+ imagebot.run()
+
+
+def main(*args):
+ generator = None
+ autonomous = False
+ checkTemplate = True
+
+ # Load a lot of default generators
+ genFactory = pagegenerators.GeneratorFactory()
+ local_args = pywikibot.handle_args(args)
+ for arg in local_args:
+ if arg == '-nochecktemplate':
+ checkTemplate = False
+ elif arg == '-autonomous':
+ autonomous = True
+ else:
+ genFactory.handleArg(arg)
+
+ generator = genFactory.getCombinedGenerator()
+ if not generator:
+ pywikibot.bot.suggest_help(missing_generator=True)
+ return False
+
+ if not supportedSite():
+ pywikibot.output(u'Sorry, this site is not supported (yet).')
+ return False
+
+ pywikibot.warning(u'This is an experimental bot')
+ pywikibot.warning(u'It will only work on self published work images')
+ pywikibot.warning(u'This bot is still full of bugs')
+ pywikibot.warning(u'Use at your own risk!')
+
+ pregenerator = pagegenerators.PreloadingGenerator(generator)
+
+ prefetchQueue = Queue(maxsize=50)
+ uploadQueue = Queue(maxsize=200)
+
+ imageFetcherThread = imageFetcher(pregenerator, prefetchQueue)
+ userInteractionThread = userInteraction(prefetchQueue, uploadQueue)
+ uploaderThread = uploader(uploadQueue)
+
+ imageFetcherThread.daemon = False
+ userInteractionThread.daemon = False
+ uploaderThread.daemon = False
+
+ if autonomous:
+ pywikibot.output(u'Bot is running in autonomous mode. There will be no '
+ u'user interaction.')
+ userInteractionThread.setAutonomous()
+
+ if not checkTemplate:
+ pywikibot.output(u'No check template will be added to the uploaded '
+ u'files.')
+ uploaderThread.nochecktemplate()
+
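+ # Using the listed variables one may keep track of thread start status
+ # fetchDone = imageFetcherThread.start()
+ # userDone = userInteractionThread.start()
+ # uploadDone = uploaderThread.start()
+ # Without these calls the threads are created but never run.
+ imageFetcherThread.start()
+ userInteractionThread.start()
+ uploaderThread.start()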
+
+
+if __name__ == "__main__":
+ main()
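
Note on the thread hand-off in imagecopy_self.py: the user interaction and upload threads read from a queue until they receive a falsy item, and the user interaction thread puts None on the upload queue when it finishes. The following is a minimal standalone sketch of that sentinel pattern in Python 3, not the script's own code. The names Stage and run_pipeline and the two toy work functions are hypothetical; only the queue sizes (50 and 200) and the use of None as the end marker are taken from the diff. The sketch tests `item is None` rather than truthiness, which is an assumption about the intended behaviour.

```python
import threading
from queue import Queue


class Stage(threading.Thread):
    """Hypothetical pipeline stage: read from inbox, write to outbox."""

    def __init__(self, inbox, outbox, work):
        super().__init__()
        self.inbox = inbox
        self.outbox = outbox
        self.work = work

    def run(self):
        while True:
            item = self.inbox.get()
            if item is None:
                # Sentinel received: the upstream stage has finished.
                break
            self.outbox.put(self.work(item))
        # Pass the sentinel on so the next stage stops as well.
        self.outbox.put(None)


def run_pipeline(items):
    """Push items through two stages and return the results in order."""
    prefetch_queue = Queue(maxsize=50)
    upload_queue = Queue(maxsize=200)
    done = Queue()  # unbounded, collects the final results

    stages = [
        Stage(prefetch_queue, upload_queue, str.upper),
        Stage(upload_queue, done, lambda name: 'File:' + name),
    ]
    for stage in stages:
        stage.start()

    for item in items:
        prefetch_queue.put(item)
    prefetch_queue.put(None)  # tell the first stage there is no more work

    for stage in stages:
        stage.join()

    results = []
    while True:
        item = done.get()
        if item is None:
            break
        results.append(item)
    return results
```

Each stage has a single consumer thread, so items leave the pipeline in the order they entered. A stage that is never started leaves its inbox undrained, and a producer blocks once a bounded queue is full.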
diff --git a/scripts/imageharvest.py b/scripts/imageharvest.py
new file mode 100644
index 0000000..c2d0705
--- /dev/null
+++ b/scripts/imageharvest.py
@@ -0,0 +1,152 @@
+# -*- coding: utf-8 -*-
+"""
+Bot for getting multiple images from an external site.
+
+It takes a URL as an argument and finds all images (and other files specified
+by the extensions in 'fileformats') that URL is referring to, asking whether to
+upload them. If further arguments are given, they are considered to be the text
+that is common to the descriptions.
+
+A second use is to get a number of images that have URLs only differing in
+numbers. To do this, use the command line option "-pattern", and give the URL
+with the variable part replaced by '$' (if that character occurs in the URL
+itself, you will have to change the bot code, my apologies).
+
+Other options:
+-shown Choose images shown on the page as well as linked from it
+-justshown Choose _only_ images shown on the page, not those linked
+"""
+# (C) Pywikibot team, 2004-2015
+#
+# Distributed under the terms of the MIT license.
+#
+from __future__ import absolute_import, unicode_literals
+
+__version__ = '$Id$'
+
+import os
+
+import BeautifulSoup
+
+import pywikibot
+
+from pywikibot.tools import PY2
+
+from scripts import upload
+
+if not PY2:
+ from urllib.parse import urljoin
+ from urllib.request import URLopener
+
+ basestring = (str,)
+else:
+ from urllib import URLopener
+ from urlparse import urljoin
+
+
+def get_imagelinks(url):
+ """Given a URL, get all images linked to by the page at that URL."""
+ links = []
+ uo = URLopener()
+ file = uo.open(url)
+ soup = BeautifulSoup.BeautifulSoup(file.read())
+ file.close()
+ if not shown:
+ tagname = "a"
+ elif shown == "just":
+ tagname = "img"
+ else:
+ tagname = ["a", "img"]
+
+ for tag in soup.findAll(tagname):
+ link = tag.get("src", tag.get("href", None))
+ if link:
+ ext = os.path.splitext(link)[1].lower().strip('.')
+ if ext in fileformats:
+ # urljoin resolves relative links against the page URL
+ links.append(urljoin(url, link))
+ return links
+
+
+def main(give_url, image_url, desc):
+ url = give_url
+ if url == '':
+ if image_url:
+ url = pywikibot.input(u"What URL range should I check "
+ u"(use $ for the part that is changeable)")
+ else:
+ url = pywikibot.input(u"From what URL should I get the images?")
+
+ if image_url:
+ minimum = 1
+ maximum = 99
+ answer = pywikibot.input(
+ u"What is the first number to check (default: 1)")
+ if answer:
+ minimum = int(answer)
+ answer = pywikibot.input(
+ u"What is the last number to check (default: 99)")
+ if answer:
+ maximum = int(answer)
+
+ if not desc:
+ basicdesc = pywikibot.input(
+ u"What text should be added at the end of "
+ u"the description of each image from this url?")
+ else:
+ basicdesc = desc
+
+ if image_url:
+ ilinks = []
+ i = minimum
+ while i <= maximum:
+ ilinks += [url.replace("$", str(i))]
+ i += 1
+ else:
+ ilinks = get_imagelinks(url)
+
+ for image in ilinks:
+ if pywikibot.input_yn(u'Include image %s?' % image, default=False,
+ automatic_quit=False):
+ desc = pywikibot.input(u"Give the description of this image:")
+ categories = []
+ while True:
+ cat = pywikibot.input(u"Specify a category (or press enter to "
+ u"end adding categories)")
+ if not cat.strip():
+ break
+ if ":" in cat:
+ categories.append(u"[[%s]]" % cat)
+ else:
+ categories.append(u"[[%s:%s]]"
+ % (mysite.namespace(14), cat))
+ desc += "\r\n\r\n" + basicdesc + "\r\n\r\n" + \
+ "\r\n".join(categories)
+ uploadBot = upload.UploadRobot(image, description=desc)
+ uploadBot.run()
+
+
+try:
+ url = u''
+ image_url = False
+ shown = False
+ desc = []
+
+ for arg in pywikibot.handle_args():
+ if arg == "-pattern":
+ image_url = True
+ elif arg == "-shown":
+ shown = True
+ elif arg == "-justshown":
+ shown = "just"
+ elif url == u'':
+ url = arg
+ else:
+ desc += [arg]
+ desc = ' '.join(desc)
+
+ fileformats = ('jpg', 'jpeg', 'png', 'gif', 'svg', 'ogg')
+ mysite = pywikibot.Site()
+ main(url, image_url, desc)
+finally:
+ pywikibot.stopme()
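
Note on the "-pattern" mode of imageharvest.py: the script builds its list of candidate URLs by replacing '$' in the given URL with each number from the first to the last one entered (defaults 1 and 99). A minimal standalone sketch of that expansion follows. The function name expand_pattern and its parameters are hypothetical and do not exist in the script, and example.org is a placeholder host.

```python
def expand_pattern(url, minimum=1, maximum=99, placeholder='$'):
    """Return one URL per number in [minimum, maximum].

    Every occurrence of the placeholder is replaced by the number,
    which mirrors url.replace("$", str(i)) in the script.
    """
    return [url.replace(placeholder, str(i))
            for i in range(minimum, maximum + 1)]


# Example: three numbered images on a placeholder host.
for link in expand_pattern('http://example.org/img$.jpg', 1, 3):
    print(link)
```

As the module docstring already warns, a URL that itself contains '$' cannot be used with this scheme. The optional placeholder argument in the sketch is one possible way around that, not something the script offers.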
diff --git a/scripts/panoramiopicker.py b/scripts/panoramiopicker.py
new file mode 100644
index 0000000..d5927ed
--- /dev/null
+++ b/scripts/panoramiopicker.py
@@ -0,0 +1,379 @@
+#!/usr/bin/python
+# -*- coding: utf-8 -*-
+"""Tool to copy a Panoramio set to Commons."""
+#
+# (C) Multichill, 2010
+# (C) Pywikibot team, 2010-2015
+#
+# Distributed under the terms of the MIT license.
+#
+from __future__ import absolute_import, unicode_literals
+
+__version__ = '$Id$'
+
+import base64
+import hashlib
+import json
+import re
+import socket
+import StringIO
+
+from BeautifulSoup import BeautifulSoup
+
+import pywikibot
+
+from pywikibot import config
+
+from pywikibot.tools import PY2
+
+from scripts import imagerecat, upload
+
+if not PY2:
+ from urllib.request import urlopen
+else:
+ from urllib import urlopen
+
+try:
+ from pywikibot.userinterfaces.gui import Tkdialog
+except ImportError as _tk_error:
+ Tkdialog = None
+
+
+def isAllowedLicense(photoInfo=None):
+ """
+ Check if the image contains the right license.
+
+ TODO: Maybe add more licenses
+
+ """
+ allowed = [u'by-sa']
+ return photoInfo[u'license'] in allowed
+
+
+def downloadPhoto(photoUrl=''):
+ """
+ Download the photo and store it in a StringIO.StringIO object.
+
+ TODO: Add exception handling
+
+ """
+ imageFile = urlopen(photoUrl).read()
+ return StringIO.StringIO(imageFile)
+
+
+def findDuplicateImages(photo=None,
+ site=pywikibot.Site(u'commons', u'commons')):
+ """Return list of duplicate images.
+
+ Takes the photo, calculates the SHA1 hash and asks the mediawiki api
+ for a list of duplicates.
+
+ TODO: Add exception handling, fix site thing
+
+ """
+ hashObject = hashlib.sha1()
+ hashObject.update(photo.getvalue())
+ # Materialise the result so callers can test it for emptiness and pop().
+ return list(site.allimages(sha1=base64.b16encode(hashObject.digest())))
+
+
+def getLicense(photoInfo=None):
+ """Adding license to the Panoramio API with a beautiful soup hack."""
+ photoInfo['license'] = u'c'
+ page = urlopen(photoInfo.get(u'photo_url'))
+ data = page.read()
+ soup = BeautifulSoup(data)
+ if soup.find("div", {'id': 'photo-info'}):
+ pointer = soup.find("div", {'id': 'photo-info'})
+ if pointer.find("div", {'id': 'photo-details'}):
+ pointer = pointer.find("div", {'id': 'photo-details'})
+ if pointer.find("ul", {'id': 'details'}):
+ pointer = pointer.find("ul", {'id': 'details'})
+ if pointer.find("li", {'class': 'license by-sa'}):
+ photoInfo['license'] = u'by-sa'
+ # Does Panoramio have more license options?
+
+ return photoInfo
+
+
+def getFilename(photoInfo=None, site=pywikibot.Site(u'commons',
+ u'commons'),
+ project=u'Panoramio'):
+ """Build a good filename for the upload.
+
+ The name is based on the username and the title. Prevents naming collisions.
+ """
+ username = photoInfo.get(u'owner_name')
+ title = photoInfo.get(u'photo_title')
+ if title:
+ title = cleanUpTitle(title)
+ else:
+ title = u''
+
+ if pywikibot.Page(site, u'File:%s - %s - %s.jpg'
+ % (project, username, title)).exists():
+ i = 1
+ while True:
+ if (pywikibot.Page(site, u'File:%s - %s - %s (%s).jpg'
+ % (project, username, title, str(i))).exists()):
+ i += 1
+ else:
+ return u'%s - %s - %s (%s).jpg' % (project, username, title,
+ str(i))
+ else:
+ return u'%s - %s - %s.jpg' % (project, username, title)
+
+
+def cleanUpTitle(title):
+ """Clean up the title of a potential mediawiki page.
+
+ Otherwise the title of the page might not be allowed by the software.
+ """
+ title = title.strip()
+ title = re.sub(u"[<{\\[]", u"(", title)
+ title = re.sub(u"[>}\\]]", u")", title)
+ title = re.sub(u"[ _]?\\(!\\)", u"", title)
+ title = re.sub(u",:[ _]", u", ", title)
+ title = re.sub(u"[;:][ _]", u", ", title)
+ title = re.sub(u"[\t\n ]+", u" ", title)
+ title = re.sub(u"[\r\n ]+", u" ", title)
+ title = re.sub(u"[\n]+", u"", title)
+ title = re.sub(u"[?!]([.\"]|$)", u"\\1", title)
+ title = re.sub(u"[&#%?!]", u"^", title)
+ title = re.sub(u"[;]", u",", title)
+ title = re.sub(u"[/+\\\\:]", u"-", title)
+ title = re.sub(u"--+", u"-", title)
+ title = re.sub(u",,+", u",", title)
+ title = re.sub(u"[-,^]([.]|$)", u"\\1", title)
+ title = title.replace(u" ", u"_")
+ return title
+
+
+def getDescription(photoInfo=None, panoramioreview=False, reviewer=u'',
+ override=u'', addCategory=u''):
+ """Build description for the image."""
+ desc = u''
+ desc += u'{{Information\n'
+ desc += u'|description=%(photo_title)s\n'
+ desc += u'|date=%(upload_date)s (upload date)\n'
+ desc += u'|source=[%(photo_url)s Panoramio]\n'
+ desc += u'|author=[%(owner_url)s?with_photo_id=%(photo_id)s %(owner_name)s] \n'
+ desc += u'|permission=\n'
+ desc += u'|other_versions=\n'
+ desc += u'|other_fields=\n'
+ desc += u'}}\n'
+ if photoInfo.get(u'latitude') and photoInfo.get(u'longitude'):
+ desc += u'{{Location dec|%(latitude)s|%(longitude)s|source:Panoramio}}\n'
+ desc += u'\n'
+ desc += u'=={{int:license-header}}==\n'
+
+ if override:
+ desc += override
+ else:
+ if photoInfo.get(u'license') == u'by-sa':
+ desc += u'{{Cc-by-sa-3.0}}\n'
+ if panoramioreview:
+ desc += u'{{Panoramioreview|%s|{{subst:CURRENTYEAR}}-{{subst:CURRENTMONTH}}-{{subst:CURRENTDAY2}}}}\n' % (reviewer,)
+ else:
+ desc += u'{{Panoramioreview}}\n'
+
+ desc += u'\n'
+ cats = u''
+ if addCategory:
+ desc += u'\n[[Category:%s]]\n' % (addCategory,)
+ cats = True
+
+ # Get categories based on location
+ if photoInfo.get(u'latitude') and photoInfo.get(u'longitude'):
+ cats = imagerecat.getOpenStreetMapCats(photoInfo.get(u'latitude'),
+ photoInfo.get(u'longitude'))
+ cats = imagerecat.applyAllFilters(cats)
+ for cat in cats:
+ desc += u'[[Category:%s]]\n' % (cat,)
+ if not cats:
+ desc += u'{{subst:Unc}}\n'
+
+ return desc % photoInfo
+
+
+def processPhoto(photoInfo=None, panoramioreview=False, reviewer=u'',
+ override=u'', addCategory=u'', autonomous=False):
+ """Process a single Panoramio photo."""
+ if isAllowedLicense(photoInfo) or override:
+ # Should download the photo only once
+ photo = downloadPhoto(photoInfo.get(u'photo_file_url'))
+
+ # Don't upload duplicate images, should add override option
+ duplicates = findDuplicateImages(photo)
+ if duplicates:
+ pywikibot.output(u'Found duplicate image at %s' % duplicates.pop())
+ else:
+ filename = getFilename(photoInfo)
+ pywikibot.output(filename)
+ description = getDescription(photoInfo, panoramioreview,
+ reviewer, override, addCategory)
+
+ pywikibot.output(description)
+ if not autonomous:
+ (newDescription, newFilename, skip) = Tkdialog(
+ description, photo, filename).show_dialog()
+ else:
+ newDescription = description
+ newFilename = filename
+ skip = False
+# pywikibot.output(newPhotoDescription)
+# if (pywikibot.Page(title=u'File:'+ filename,
+# site=pywikibot.Site()).exists()):
+# # I should probably check if the hash is the same and if not upload
+# # it under a different name
+# pywikibot.output(u'File:' + filename + u' already exists!')
+# else:
+ # Do the actual upload
+ # Would be nice to check before I upload if the file is already at
+ # Commons
+ # Not that important for this program, but maybe for derived
+ # programs
+ if not skip:
+ bot = upload.UploadRobot(photoInfo.get(u'photo_file_url'),
+ description=newDescription,
+ useFilename=newFilename,
+ keepFilename=True,
+ verifyDescription=False)
+ bot.upload_image(debug=False)
+ return 1
+ return 0
+
+
+def getPhotos(photoset=u'', start_id='', end_id='', interval=100):
+ """Loop over a set of Panoramio photos."""
+ i = 0
+ has_more = True
+ url = u'http://www.panoramio.com/map/get_panoramas.php?set=%s&from=%s&…
+ while has_more:
+ gotInfo = False
+ maxtries = 10
+ tries = 0
+ while not gotInfo:
+ try:
+ if tries < maxtries:
+ tries += 1
+ panoramioApiPage = urlopen(url % (photoset, i, i + interval))
+ contents = panoramioApiPage.read().decode('utf-8')
+ gotInfo = True
+ i += interval
+ else:
+ break
+ except IOError:
+ pywikibot.output(u'Got an IOError, let\'s try again')
+ except socket.timeout:
+ pywikibot.output(u'Got a timeout, let\'s try again')
+
+ print(contents)
+ metadata = json.loads(contents)
+ photos = metadata.get(u'photos')
+ for photo in photos:
+ yield photo
+ has_more = metadata.get(u'has_more')
+ return
+
+
+def usage():
+ """Print usage information.
+
+ TODO : Need more.
+ """
+ pywikibot.output(
+ u"Panoramiopicker is a tool to transfer Panaramio photos to Wikimedia "
+ u"Commons")
+ pywikibot.output(u"-set:<set_id>\n")
+ return
+
+
+def main(*args):
+ """Process command line arguments and perform task."""
+ site = pywikibot.Site(u'commons', u'commons')
+ config.family = site.family
+ config.lang = site.lang
+# imagerecat.initLists()
+
+ photoset = u'' # public (popular photos), full (all photos), user ID number
+ start_id = u''
+ end_id = u''
+ addCategory = u''
+ autonomous = False
+ totalPhotos = 0
+ uploadedPhotos = 0
+
+ # Do we mark the images as reviewed right away?
+ if config.panoramio['review']:
+ panoramioreview = config.panoramio['review']
+ else:
+ panoramioreview = False
+
+ # Set the Panoramio reviewer
+ if config.panoramio['reviewer']:
+ reviewer = config.panoramio['reviewer']
+ elif 'commons' in config.sysopnames['commons']:
+ print(config.sysopnames['commons'])
+ reviewer = config.sysopnames['commons']['commons']
+ elif 'commons' in config.usernames['commons']:
+ reviewer = config.usernames['commons']['commons']
+ else:
+ reviewer = u''
+
+ # Should be renamed to overrideLicense or something like that
+ override = u''
+ local_args = pywikibot.handle_args(args)
+ for arg in local_args:
+ if arg.startswith('-set'):
+ if len(arg) == 4:
+ photoset = pywikibot.input(u'What is the set?')
+ else:
+ photoset = arg[5:]
+ elif arg.startswith('-start_id'):
+ if len(arg) == 9:
+ start_id = pywikibot.input(
+ u'What is the id of the photo you want to start at?')
+ else:
+ start_id = arg[10:]
+ elif arg.startswith('-end_id'):
+ if len(arg) == 7:
+ end_id = pywikibot.input(
+ u'What is the id of the photo you want to end at?')
+ else:
+ end_id = arg[8:]
+ elif arg == '-panoramioreview':
+ panoramioreview = True
+ elif arg.startswith('-reviewer'):
+ if len(arg) == 9:
+ reviewer = pywikibot.input(u'Who is the reviewer?')
+ else:
+ reviewer = arg[10:]
+ elif arg.startswith('-override'):
+ if len(arg) == 9:
+ override = pywikibot.input(u'What is the override text?')
+ else:
+ override = arg[10:]
+ elif arg.startswith('-addcategory'):
+ if len(arg) == 12:
+ addCategory = pywikibot.input(
+ u'What category do you want to add?')
+ else:
+ addCategory = arg[13:]
+ elif arg == '-autonomous':
+ autonomous = True
+
+ if photoset:
+ for photoInfo in getPhotos(photoset, start_id, end_id):
+ photoInfo = getLicense(photoInfo)
+ # time.sleep(10)
+ uploadedPhotos += processPhoto(photoInfo, panoramioreview,
+ reviewer, override, addCategory,
+ autonomous)
+ totalPhotos += 1
+ else:
+ usage()
+ pywikibot.output(u'Finished running')
+ pywikibot.output(u'Total photos: ' + str(totalPhotos))
+ pywikibot.output(u'Uploaded photos: ' + str(uploadedPhotos))
+
+if __name__ == "__main__":
+ main()
diff --git a/tests/script_tests.py b/tests/script_tests.py
index 9dfcc94..0e94376 100644
--- a/tests/script_tests.py
+++ b/tests/script_tests.py
@@ -23,17 +23,26 @@
scripts_path = os.path.join(_root_dir, 'scripts')
+if PY2:
+ TK_IMPORT = 'Tkinter'
+else:
+ TK_IMPORT = 'tkinter'
+
# These dependencies are not always the package name which is in setup.py.
# e.g. 'PIL.ImageTk' is a object provided by several different pypi packages,
# and setup.py requests that 'Pillow' is installed to provide 'PIL.ImageTk'.
# Here, it doesnt matter which pypi package was requested and installed.
# Here, the name given to the module which will be imported is required.
script_deps = {
+ 'imagecopy': [TK_IMPORT],
+ 'imagecopy_self': [TK_IMPORT],
'script_wui': ['crontab', 'lua'],
# Note: package 'lunatic-python' provides module 'lua'
'flickrripper': ['flickrapi'],
+ 'imageharvest': ['BeautifulSoup'],
'match_images': ['PIL.ImageTk'],
+ 'panoramiopicker': ['BeautifulSoup'],
'states_redirect': ['pycountry'],
'patrol': ['mwlib'],
'weblinkchecker.py': ['memento_client'],
@@ -138,8 +147,10 @@
'imageuncat': 'WARNING: This script is primarily written for Wikimedia Commons',
# script_input['interwiki'] above lists a title that should not exist
'interwiki': 'does not exist. Skipping.',
+ 'imageharvest': 'From what URL should I get the images',
'login': 'Logged in on ',
'pagefromfile': 'Please enter the file name',
+ 'panoramiopicker': 'Panoramiopicker is a tool to transfer Panaramio ',
'replace': 'Press Enter to use this automatic message',
'script_wui': 'Pre-loading all relevant page contents',
'shell': ('>>> ', 'Welcome to the'),
--
To view, visit https://gerrit.wikimedia.org/r/196630
To unsubscribe, visit https://gerrit.wikimedia.org/r/settings
Gerrit-MessageType: merged
Gerrit-Change-Id: I5ea3a2131badba22fdc5e99deb5c40a49f4f0998
Gerrit-PatchSet: 12
Gerrit-Project: pywikibot/core
Gerrit-Branch: master
Gerrit-Owner: Prianka <priyankajayaswal025(a)gmail.com>
Gerrit-Reviewer: John Vandenberg <jayvdb(a)gmail.com>
Gerrit-Reviewer: Ladsgroup <ladsgroup(a)gmail.com>
Gerrit-Reviewer: Merlijn van Deen <valhallasw(a)arctus.nl>
Gerrit-Reviewer: Ricordisamoa <ricordisamoa(a)openmailbox.org>
Gerrit-Reviewer: XZise <CommodoreFabianus(a)gmx.de>
Gerrit-Reviewer: jenkins-bot <>