Damnit... SF mailing lists strike again...
^_^ Any chance of also improving the interface and its usability?
The user-config.py file is quite basic: it holds nothing but basic data
and configuration, so it could easily be simplified to a format that
code can modify, like ini or xml. I see a lot of users whose first
issue is "Why does my configuration file not work?" or who don't
even know how to use it.
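
Just to illustrate the ini idea, here is a minimal sketch using the
standard configparser module (Python 3). The section and key names are
made up for the example; they are not an existing pywikipedia format.

```python
import configparser

# Hypothetical ini layout; section and key names are illustrative only.
SAMPLE = """
[defaults]
family = wikipedia
lang = en

[user:en.wikipedia]
normal = FooMan
sysop = FooOp
"""


def load_config(text):
    """Parse ini text and return a ConfigParser instance."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    return parser


def set_user(parser, wiki, role, name):
    """Record a username for a wiki, creating the section if needed."""
    section = "user:" + wiki
    if not parser.has_section(section):
        parser.add_section(section)
    parser.set(section, role, name)


config = load_config(SAMPLE)
```

The point is only that a tool can read and rewrite a file like this
without the user ever touching Python syntax.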
Then there are the people with the issue of the family files...
Honestly, the rewrite is already trying to move to an API format, and
there is very little data put in family files which we can't get from
the API... Almost nothing except URL related things.
Ok, bots need to be run from the cli, otherwise options can't be passed
to them, so I'm not saying we should use a GUI.
However, Python is built with the ability to act as a shell.
Why don't we turn this into something run from an interactive shell?
People already need to open up a cli, so why make them try to figure
out how to handle a confusing set of other files when we could probably
let them handle the data over the cli?
So to startup someone would type this to enter the shell:
To run a bot:
run replace wiki=foo start=! regex=True
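
A rough sketch of how a shell like that could parse the run line,
assuming the standard cmd and shlex modules. The BotShell class, the
prompt text and the parse_options helper are hypothetical names for the
example, and the run command only echoes what it would do.

```python
import cmd
import shlex


def parse_options(argline):
    """Split 'replace wiki=foo start=! regex=True' into a name and a dict."""
    words = shlex.split(argline)
    if not words:
        return None, {}
    name, options = words[0], {}
    for word in words[1:]:
        key, sep, value = word.partition("=")
        # A bare word with no '=' is treated as a boolean flag.
        options[key] = value if sep else "True"
    return name, options


class BotShell(cmd.Cmd):
    """Hypothetical interactive shell; command names follow this mail."""

    prompt = "pywiki> "

    def do_run(self, argline):
        """run <script> key=value ... : start a bot script."""
        script, options = parse_options(argline)
        if script is None:
            print("usage: run <script> key=value ...")
            return
        # A real shell would hand off to the bot here; this only echoes.
        print("would run %s with %r" % (script, options))

    def do_quit(self, argline):
        """quit : leave the shell."""
        return True
```

Calling BotShell().cmdloop() would start the prompt; the values stay
strings here, so converting regex=True to a real boolean is left to
whatever script receives them.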
The wiki= parameter would accept one of four forms: 'id', which would
use that wiki with the user's default language; 'lang', which would use
the default wiki in a different language; 'lang.id', which would use
that wiki in that language; or a url, for quick and easy use when some
new wiki asks "Can you quickly replace some text on the wiki for us?".
If there is a :// in the input, it is taken as a url like
"http://naruto.wikia.com" and the needed info is grabbed from the API.
That will work for 99% of wikis, but you may need to add an alias if one
needs special config.
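
The four forms could be told apart roughly like this. It is only a
sketch: parse_wiki_spec, the known_ids set and the default values are
assumptions for the example, and telling a bare 'id' from a bare 'lang'
is done here simply by checking the word against the ids the user has
already defined.

```python
from urllib.parse import urlparse


def parse_wiki_spec(spec, known_ids, default_lang="en", default_id="wikipedia"):
    """Resolve a wiki= value into a small dict describing the target."""
    # Check for a url first, since urls also contain dots.
    if "://" in spec:
        parsed = urlparse(spec)
        return {"kind": "url", "base": "%s://%s" % (parsed.scheme, parsed.netloc)}
    # 'lang.id' names both parts explicitly.
    if "." in spec:
        lang, _, wiki_id = spec.partition(".")
        return {"kind": "alias", "lang": lang, "id": wiki_id}
    # A bare word is an id if the user has defined it, otherwise a language.
    if spec in known_ids:
        return {"kind": "alias", "lang": default_lang, "id": spec}
    return {"kind": "alias", "lang": spec, "id": default_id}
```

For example, parse_wiki_spec("fr", {"wikipedia"}) falls through to the
last branch and picks the default wiki in French.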
To deal with the user stuff we can let them set users:
config user en.wikipedia normal=FooMan sysop=FooOp
And instead of long family files, just let them specify ids which are
mapped to urls... Like git's remote command
it'll retrieve the info from the api... Of course for wiki like
And the data will be grabbed from the API for the namespaces versions
To refresh our namespaces and other data from the site:
wiki refresh en.wikipedia
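
What a refresh could do underneath, as a sketch: build a siteinfo query
for the MediaWiki API and pull the namespace names out of the JSON. The
helper names are hypothetical, the api.php location differs per wiki,
and the sample response is hand-written and trimmed, not real output.

```python
import json
from urllib.parse import urlencode


def siteinfo_url(api_base):
    """Build a siteinfo query url for the given api.php location."""
    params = {
        "action": "query",
        "meta": "siteinfo",
        "siprop": "general|namespaces",
        "format": "json",
    }
    return api_base + "?" + urlencode(params)


def extract_namespaces(response_text):
    """Map namespace number to name from a siteinfo JSON response."""
    data = json.loads(response_text)
    namespaces = data["query"]["namespaces"]
    # In the classic JSON format the namespace name sits under the "*" key.
    return {int(ns["id"]): ns["*"] for ns in namespaces.values()}


# Hand-written, trimmed sample for illustration only.
SAMPLE_RESPONSE = (
    '{"query": {"namespaces": {"0": {"id": 0, "*": ""},'
    ' "1": {"id": 1, "*": "Talk"}}}}'
)
```

Fetching the url and caching the result per wiki id is left out; the
parsing step is the part that would replace most of a family file.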
When there are extra options that config needs that we can't get through
the api, they can be set. set-prop is kinda svn inspired... option,
param, conf, prop, or whatever you think works best and most people will
understand, we can pick.
Say we needed to set the nicepath for some reason and the default wasn't
wiki prop en.wikipedia nicepath /wiki/$1
Of course, all this kind of data is retrievable:
wiki prop en.wikipedia nicepath
config user en.wikipedia
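
The set/get behaviour of wiki prop could be as simple as this sketch,
where the same command sets a value when one is given and reads it back
when it is not. WikiProps is a hypothetical name, and the store is kept
in memory only; saving it to disk is left out.

```python
class WikiProps:
    """In-memory store for per-wiki overrides; persistence is left out."""

    def __init__(self):
        self._props = {}

    def handle(self, args):
        """Handle the words after 'wiki prop': <wiki> <name> [value]."""
        if len(args) == 2:
            # Two words: look the property up, None if it was never set.
            wiki, name = args
            return self._props.get(wiki, {}).get(name)
        if len(args) == 3:
            # Three words: store the override for that wiki.
            wiki, name, value = args
            self._props.setdefault(wiki, {})[name] = value
            return value
        raise ValueError("usage: wiki prop <wiki> <name> [value]")
```

So handle("en.wikipedia nicepath /wiki/$1".split()) stores the override,
and handle(["en.wikipedia", "nicepath"]) reads it back.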
wiki base en.anime http://anime.wikia.com
And wiki can be changed if needed, say if the wiki moved urls (My sync
bots broke when Wikia moved a wiki or two, and their redirect broke
Kind of an idea...
~Daniel Friesen(Dantman) of:
-The Gaiapedia (http://gaia.wikia.com)
-Wikia ACG on Wikia.com
Nicolas Dumazet wrote:
I would encourage any dev / bot owner fluent enough in Python to give
the rewrite a try, particularly if you use scripts fetching a lot
of data from MediaWiki.
I wrote, for example, a maintenance script for the French translation
project. It fetches hundreds of pages, does some mumbo jumbo magic on
them, and eventually uses that data to update ~200 summary pages.
I first wrote it using our trunk pywikipedia. (since editing through
the api is not yet available, I first thought that it was the only
way). It was very, very *slow*.
I wondered what improvements I would get using the rewrite, and I
rewrote my script to use the rewrite for all the page fetching part.
Well, I don't have precise figures, but I would say that the latter
version was probably 3 to 10 times faster.
In addition to being faster, the more we use the rewrite, the more
bugs we'll be able to detect and correct, and the easier it will
be to merge the branch when API editing becomes available.
(Speaking of debugging, if you're being annoyed by the debug output
while writing your scripts, putting
import logging ; logging.getLogger().setLevel(logging.INFO) in your
script header will help.)