Damnit... SF mailing lists strike again... ^_^

Any chance of also improving the interface and its usability? The user-config.py file is quite basic; it holds nothing but plain data and configuration, so it could easily be simplified to a format that code can modify, like ini or xml. I see a lot of users whose first issue is "Why does my configuration file not work?", or who don't even know how to use it. Then there are the people who have issues with the family files... Honestly, the rewrite is already trying to move to an API format, and there is very little data in the family files which we can't get from the API... almost nothing except URL-related things.

Ok, bots need to be run from the cli, otherwise options can't be passed to them, so I'm not saying we should use a GUI. However, Python is built with the ability to act as a shell.
Why don't we turn this into something run by an interactive shell? People already need to open up a cli, so why make them try to figure out a confusing set of other files when we could probably let them handle the data over the cli?
So to start up, someone would type this to enter the shell:
python pywiki.py
To run a bot:
run replace wiki=foo start=! regex=True "Bar[0-9]" "Baz5"
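Python's standard cmd module is one way such a shell could be built. What follows is only a rough sketch under my own assumptions: the class name PywikiShell, the helper parse_run_args and the rule that "word=value" tokens are options are all made up for illustration, and the run command only echoes what it parsed instead of starting a bot.

```python
import cmd
import shlex


def parse_run_args(line):
    """Split a 'run' line into (script, options, positionals).

    Assumption: a token of the form word=value is an option and anything
    else is a positional argument, so a positional that itself looks like
    word=value would need different handling in a real shell.
    """
    tokens = shlex.split(line)  # honours the quotes around "Bar[0-9]"
    if not tokens:
        raise ValueError("usage: run <script> [key=value ...] [args ...]")
    script, options, positionals = tokens[0], {}, []
    for token in tokens[1:]:
        key, sep, value = token.partition("=")
        if sep and key.isidentifier():
            options[key] = value
        else:
            positionals.append(token)
    return script, options, positionals


class PywikiShell(cmd.Cmd):
    """Hypothetical interactive shell; command names are illustrative."""

    prompt = "pywiki> "

    def do_run(self, line):
        """run <script> [key=value ...] [args ...]"""
        try:
            script, options, positionals = parse_run_args(line)
        except ValueError as err:
            print(err)
            return
        # A real shell would dispatch to the bot script here.
        print("script:", script)
        print("options:", options)
        print("arguments:", positionals)

    def do_quit(self, line):
        """Leave the shell."""
        return True  # a true return value ends cmdloop()


# Calling PywikiShell().cmdloop() would start the interactive prompt.
```

The cmd module gives us help text and a command loop for free, and each do_* method is one command, so adding commands later stays cheap.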
The wiki= parameter would accept one of four forms: 'id', which would use that wiki in the user's default language; 'lang', which would use the default wiki in a different language; 'lang.id', which would use that wiki in that language; or, for quick and easy use when some new wiki asks "Can you quickly replace some text on the wiki for us?", a url. If there is a :// in the input, it is taken as a url such as "http://naruto.wikia.com" and the needed info is grabbed from the API. That will work for 99% of wikis, but you may need to add an alias if a wiki needs special config.
To deal with the user stuff we can let them set users:
config user en.wikipedia normal=FooMan sysop=FooOp
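Going back to the wiki= parameter for a moment, the four forms could be told apart roughly like this. This is a sketch only: classify_wiki_arg, known_ids and known_langs are hypothetical names, and letting a known id win over a language code when a bare word could be either is my assumption, nothing that is decided.

```python
def classify_wiki_arg(value, known_ids, known_langs):
    """Return (kind, payload) for a wiki= value.

    known_ids and known_langs are hypothetical sets the shell would keep.
    """
    if "://" in value:
        # Anything containing :// is treated as a url.
        return ("url", value)
    if "." in value:
        # lang.id, e.g. en.wikipedia
        lang, _, wiki_id = value.partition(".")
        return ("lang.id", (lang, wiki_id))
    if value in known_ids:
        # Assumption: a known id takes precedence over a language code.
        return ("id", value)
    if value in known_langs:
        return ("lang", value)
    raise ValueError("unknown wiki: %r" % value)
```

For example, classify_wiki_arg("en.wikipedia", {"wikipedia"}, {"en", "fr"}) gives ("lang.id", ("en", "wikipedia")), while "http://naruto.wikia.com" comes back tagged as a url.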
And instead of long family files, just let them specify ids which are mapped to urls... like git's remote command:
wiki add en.anime http://anime.wikia.com
And it'll retrieve the info from the api... Of course for wiki like Wikipedia:
wiki add en.wikipedia http://en.wikipedia.org/w
And the data will be grabbed from the API for the namespaces, versions and such. To refresh our namespaces and other data from the site:
wiki refresh en.wikipedia
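For what it's worth, the data for an add or a refresh could come from MediaWiki's siteinfo query (action=query with meta=siteinfo). A small sketch of building that request follows; siteinfo_url is a hypothetical helper, and it assumes api.php sits directly under the base url given to "wiki add", which may not hold for every wiki.

```python
from urllib.parse import urlencode


def siteinfo_url(base_url):
    """Build a siteinfo query URL.

    Assumption: api.php lives directly under base_url, as in the
    'wiki add' examples above.
    """
    params = {
        "action": "query",
        "meta": "siteinfo",
        "siprop": "general|namespaces",  # general site data plus namespaces
        "format": "json",
    }
    return base_url.rstrip("/") + "/api.php?" + urlencode(params)


print(siteinfo_url("http://en.wikipedia.org/w"))
```

The response could then be fetched with urllib.request.urlopen and decoded with the json module; I've left that out so the sketch doesn't touch the network.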
When config needs extra options that we can't get through the api, they can be set. (set-prop is kinda svn inspired... option, param, conf, prop, or whatever you think works best and most people will understand, we can pick.) Say we needed to set the nicepath for some reason and the default wasn't good...
wiki prop en.wikipedia nicepath /wiki/$1
Of course, all this kind of data is retrievable:
wiki prop en.wikipedia nicepath
/wiki/$1
config user en.wikipedia
Normal: FooMan Sysop: FooOp
wiki base en.anime
http://anime.wikia.com
And a wiki can be changed if needed, say if the wiki moved urls (my sync bots broke when Wikia moved a wiki or two, and their redirect broke index.php urls):
wiki switch en.anime http://en.anime.wikia.com
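To tie this back to the ini idea, the wiki add / prop / base / switch commands could all sit on top of one small store that is saved in ini form. A minimal sketch using the standard configparser module; WikiRegistry and its method names are made up for illustration, and a real version would also write the text to a file.

```python
import configparser
import io


class WikiRegistry:
    """Hypothetical store for wiki aliases, one ini section per alias."""

    def __init__(self):
        # Interpolation is disabled so values such as /wiki/$1 or
        # anything containing % are stored literally.
        self.parser = configparser.ConfigParser(interpolation=None)

    def add(self, alias, base_url):
        """wiki add <alias> <url>"""
        self.parser[alias] = {"base": base_url}

    def switch(self, alias, base_url):
        """wiki switch <alias> <url>; raises KeyError for an unknown alias."""
        self.parser[alias]["base"] = base_url

    def base(self, alias):
        """wiki base <alias>"""
        return self.parser[alias]["base"]

    def set_prop(self, alias, name, value):
        """wiki prop <alias> <name> <value>"""
        self.parser[alias][name] = value

    def get_prop(self, alias, name):
        """wiki prop <alias> <name>"""
        return self.parser[alias][name]

    def dumps(self):
        """Return the registry as ini text."""
        buffer = io.StringIO()
        self.parser.write(buffer)
        return buffer.getvalue()


registry = WikiRegistry()
registry.add("en.anime", "http://anime.wikia.com")
registry.switch("en.anime", "http://en.anime.wikia.com")
registry.add("en.wikipedia", "http://en.wikipedia.org/w")
registry.set_prop("en.wikipedia", "nicepath", "/wiki/$1")
print(registry.dumps())
```

Because the file is plain ini, people who prefer an editor can still change it by hand, and the shell can rewrite it without parsing Python.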
Kind of an idea...
~Daniel Friesen (Dantman) of:
-The Gaiapedia (http://gaia.wikia.com)
-Wikia ACG on Wikia.com (http://wikia.com/wiki/Wikia_ACG)
-and Wiki-Tools.com (http://wiki-tools.com)
Nicolas Dumazet wrote:
I would encourage any dev / bot owner fluent enough in python to give a try to the rewrite, particularly if you use scripts fetching a lot of data from mediawiki.
For example, I wrote a maintenance script for the French translation project. It fetches hundreds of pages, does some mumbo jumbo magic on them, and eventually uses that data to update ~200 summary pages.
I first wrote it using our trunk pywikipedia (since editing through the api is not yet available, I first thought that it was the only way). It was very, very *slow*. I wondered what improvements I would get using the rewrite, so I rewrote my script to use the rewrite for all the page-fetching part. Well, I don't have precise figures, but I would say that the latter version was probably 3 to 10 times faster.
In addition to being faster, the more we use the rewrite, the more bugs we'll be able to detect and correct, and the easier it will be to merge the branch when API editing becomes available.
(Speaking of debugging, if you're being annoyed by the debug output while writing your scripts, putting
import logging
logging.getLogger().setLevel(logging.INFO)
in your script header will help.)