This may be off topic, but asking on Wikimedia Commons and the Pywikipediabot mailing list didn't get me any replies. Please flame me if you think I should not make a request like the one below on this list again.

----

I would like to request a volunteer to help stabilise and fine-tune the CommonsDelinker bots, replacer.py and delinker.py. (Well, "help": I'm afraid you would have to do the dirty work, as I am just not skilled enough. I can spec it, though.)
Orgullomoore, the previous maintainer of the code, has gone backpacking in Europe and is no longer able to maintain the bots. About two weeks ago I took over running the bot from the toolserver. In those two weeks I have run into a few flaws in the bots that I would like to see fixed.
If you are an experienced Python coder who feels at home with pywikipediabot, threads and a little MySQL, and you feel like making a difference on this crucial bot that supports all users of Wikimedia Commons, please contact me. Having an account on the toolserver is not required, but it is a definite advantage.
Below is a description of the current functionality and the flaws found so far.
Thanks for your help.
Siebrand
Delinker: removes links to a file from any Wikimedia wiki once that file has been deleted from Wikimedia Commons. If a special string is added to the deletion message, this action is not performed. Success and failure of edits are logged in a MySQL database on the toolserver. Edit messages can be localised and are read from the local wiki.
Flaws:
* Fails if CheckUsage is too busy. Delinking is not performed. This should be recognised and retries should be performed until a complete CheckUsage was obtained.
* Each toolserver user is granted 15 connections to the MySQL database. Sometimes this limit is reached and in that case success and failure is not logged. This should not happen.
* If a lot of files have been deleted that are used a lot on one wiki, a lot of edits can be made to that one wiki. This can (and has) upset users in a community. An edit limit of 3 edits per minute per wiki should be in place.
* From the bot output it looks like it keeps on fetching the edit summary on multiple edits on a wiki. This causes unnecessary server load and should not occur.
* Most main projects are supported, but especially the meta projects can give issues because of the implementation of project recognition.
* TypeError: unsupported operand type(s) for -: 'unicode' and 'int'
* Sometimes when toolserver is really busy (or whatever the reason may be), a thread cannot be created. In this case a task is not executed. This should not happen.
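To make the "3 edits per minute per wiki" wish a bit more concrete, here is a rough sketch of how such a limit could work. It is not taken from delinker.py; the class name PerWikiRateLimiter, its parameters and the wiki keys such as "enwiki" are all made up for illustration, and the clock and sleep functions are injectable only so the behaviour can be checked without actually waiting.

```python
import threading
import time
from collections import defaultdict, deque


class PerWikiRateLimiter:
    """Hypothetical helper: allow at most max_edits per window seconds per wiki."""

    def __init__(self, max_edits=3, window=60.0,
                 clock=time.monotonic, sleep=time.sleep):
        self.max_edits = max_edits
        self.window = window
        self.clock = clock
        self.sleep = sleep
        self._lock = threading.Lock()
        # One queue of recent edit timestamps per wiki.
        self._stamps = defaultdict(deque)

    def _wait_time(self, wiki):
        """Seconds until the next edit on this wiki is allowed (0 if now)."""
        stamps = self._stamps[wiki]
        now = self.clock()
        # Forget edits that have fallen out of the window.
        while stamps and now - stamps[0] >= self.window:
            stamps.popleft()
        if len(stamps) < self.max_edits:
            return 0.0
        return self.window - (now - stamps[0])

    def acquire(self, wiki):
        """Block until an edit on this wiki is allowed, then record it."""
        while True:
            with self._lock:
                delay = self._wait_time(wiki)
                if delay <= 0:
                    self._stamps[wiki].append(self.clock())
                    return
            # Sleep outside the lock so threads for other wikis are not held up.
            self.sleep(delay)
```

A worker thread would call limiter.acquire("enwiki") right before saving a page on that wiki. Edits to other wikis are unaffected, because each wiki has its own queue of timestamps.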
Replacer: replaces image links on any Wikimedia wiki. Tasks are fetched from a sysop-only page on Wikimedia Commons. AnyImageTypeButSVG to SVG is not supported because of a pending SupersededSVG deletion debate. Success and failure of edits are logged in a MySQL database on the toolserver. Edit messages can be localised and are read from the local wiki. If {{stop}} is present on the wiki page, the bot will stop working until that text has been removed.
Flaws: The bots use a lot of similar code. Because of that, similar issues may be documented.
* Fails if CheckUsage is too busy. This should be recognised and retries should be performed until a complete CheckUsage was obtained.
* Each toolserver user is granted 15 connections to the MySQL database. Sometimes this limit is reached and in that case success and failure is not logged. This should not happen.
* If a lot of files have to be replaced that are used a lot on a wiki, a lot of edits can be made to that one wiki in a short time. This can (and has) upset users in a community. An edit limit of 3 edits per minute per wiki should be in place.
* From the bot output it looks like it keeps on fetching the edit summary on multiple edits on a wiki. This causes unnecessary server load and should not occur.
* Most main projects are supported, but especially the meta projects can give issues because of the implementation of project recognition.
* Image name recognition is such that errors can occur in special cases. Example: A page contains images "Test 1.jpg" and "1.jpg". If "1.jpg" is removed, "Test 1.jpg" may end up as "Test ". (example unfortunately lost)
* Although there is code present to prevent this issue, it has happened that a replacement was made when a local image with the same name was present. This should not happen. [1]
* Bot crashes on 'Unhandled exception in thread started'
* Sometimes when toolserver is really busy (or whatever the reason may be), a thread cannot be created. In this case a task is not executed. This should not happen.
* If an image has more than 50 uses on a wiki and all uses are from a template that is not displayed in the first 50 pages displayed, no text replacement will occur (can this be fixed at all?)
[1] http://en.wikipedia.org/w/index.php?title=User:TomStar81/World_War_II&di... rev&oldid=129050292
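On the "Test 1.jpg" versus "1.jpg" flaw in the Replacer list: one possible way around it is to match the file name only when it is the complete target of an image link, instead of searching for the bare name anywhere in the page text. The sketch below is not how replacer.py currently works; name_pattern and replace_image are made-up names, and it assumes only the English "Image:" and "File:" prefixes and bracketed links (no galleries, templates or localised namespace names).

```python
import re


def name_pattern(filename):
    """Regex source for a file name: spaces and underscores are
    interchangeable and the first letter may be upper or lower case.
    Assumes filename is not empty."""
    first, rest = filename[0], filename[1:]
    if first.isalpha():
        first_re = "[%s%s]" % (first.upper(), first.lower())
    else:
        first_re = re.escape(first)
    # Escape every chunk between spaces/underscores, then allow either separator.
    parts = re.split(r"[ _]+", rest)
    return first_re + "[ _]+".join(re.escape(p) for p in parts)


def replace_image(text, old, new):
    """Replace old with new only where old is the whole link target."""
    pattern = re.compile(
        r"(\[\[\s*(?:[Ii]mage|[Ff]ile)\s*:\s*)"  # "[[Image:" or "[[File:"
        + name_pattern(old)
        + r"(\s*(?:\||\]\]))"                    # followed by "|" or "]]"
    )
    # A function is used as the replacement so backslashes in new stay literal.
    return pattern.sub(lambda m: m.group(1) + new + m.group(2), text)


page = "[[Image:Test 1.jpg|thumb]] and [[Image:1.jpg]]"
print(replace_image(page, "1.jpg", "2.jpg"))
```

Because the name has to start right after the namespace prefix and end right before "|" or "]]", the link to "Test 1.jpg" is left alone while the link to "1.jpg" is rewritten.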
General wishes:
* Simple web interface for database logs (last edits, last edits per language, all edits for a particular image)
* Some kind of registration for the bots to enable starting it from cron if it is no longer running