Hi all
I propose two small changes to MediaWiki, which would allow scripts and bots to parse the HTML output of MediaWiki more easily. (I'm crossposting this to wikitech-l and pywikipediabot-users, because it's relevant to both).
The first proposal is a minimal skin that adds only the absolutely necessary HTML around the article content. This would remove a lot of overhead for bots (in terms of parsing and loading), and might also be handy for people using a PDA or cell phone to browse a wiki. Server load would also be reduced (a little). Here's the feature request (patch included): http://bugzilla.wikimedia.org/show_bug.cgi?id=3651
The second proposal is a pseudo-language, called "bot" or "none", that would cause MediaWiki to return system messages untranslated, in the form {@[key]@}. This way, system messages can be recognized and parsed easily: no need to deal with different languages, or with people changing the message. This would also save some effort on the server side (no need to look into the database for messages). Here's the feature request (patch included): http://bugzilla.wikimedia.org/show_bug.cgi?id=3652
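To illustrate how a bot might pick up such placeholders, here is a rough Python sketch. It assumes the markers appear in the HTML either as {@key@} or as {@[key]@} (the description above leaves open whether the square brackets are literal, so both are accepted), and that keys consist of letters, digits, underscores, dots and hyphens. The function name and the sample HTML are made up for illustration and are not taken from the patch.

```python
import re

# Assumption: placeholders look like {@key@} or {@[key]@}. The proposal does
# not say whether the square brackets are literal, so both forms are accepted.
# Assumption: keys contain only letters, digits, "_", "." and "-".
PLACEHOLDER = re.compile(r"\{@\[?([A-Za-z0-9_.-]+)\]?@\}")


def find_message_keys(html):
    """Return the message keys found in html, in order of appearance."""
    return PLACEHOLDER.findall(html)


# Hypothetical page fragment, not real MediaWiki output.
sample = "<h1>{@[editing]@} Foo</h1><p>{@nosuchuser@}</p>"
print(find_message_keys(sample))
```

A bot could then compare the returned keys against the ones it cares about, instead of matching the translated (and editable) message text.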
Hint: you can force a specific skin and language to be used by adding uselang=bla (resp. useskin=bla) to the URL. This also works without logging in.
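As a small sketch of the hint above, this is one way a bot could add the two parameters to a page URL in Python. The useskin and uselang parameters and the language name "bot" come from this mail; the skin name "minimal", the example.org URL and the helper name are placeholders chosen for illustration only.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_skin_and_lang(url, skin, lang):
    """Return url with useskin/uselang set, replacing any existing values."""
    parts = urlsplit(url)
    # Keep all existing query parameters except the two we are about to set.
    query = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key not in ("useskin", "uselang")
    ]
    query += [("useskin", skin), ("uselang", lang)]
    return urlunsplit(parts._replace(query=urlencode(query)))


# "minimal" is a hypothetical skin name; "bot" is the proposed pseudo-language.
print(with_skin_and_lang("http://example.org/w/index.php?title=Foo", "minimal", "bot"))
```

Since the parameters work without logging in, a bot should be able to use the resulting URL for anonymous requests as well.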
Neither patch is intended to replace a full-featured bot API. They are simple hacks that can be applied without much pain. Bots would only need minimal changes to be able to use those features. Much breakage due to changing system messages could be avoided that way, for instance.
Please give some feedback. If you like my proposals, please comment in bugzilla, so we may actually get this running on Wikimedia's servers soon.
Cheers, Daniel
I propose two small changes to MediaWiki, which would allow scripts and bots to parse the HTML output of MediaWiki more easily. (I'm crossposting this to wikitech-l and pywikipediabot-users, because it's relevant to both).
This sounds horribly redundant to me. I don't know how far the bot API effort has got (has any work on it been done at all?), but what you're proposing doesn't help it; it only adds bulk to the software.
The second proposal is a pseudo-language, called "bot" or "none", that would cause MediaWiki to return system messages untranslated, in the form {@[key]@}.
I wrote such a feature before, but with the parameter &debugmsg=1 rather than as a "pseudo-language". It was live at some point, but apparently it's not anymore. Someone must have reverted it.
Timwi
Timwi wrote:
I propose two small changes to MediaWiki, which would allow scripts and bots to parse the HTML output of MediaWiki more easily. (I'm crossposting this to wikitech-l and pywikipediabot-users, because it's relevant to both).
This sounds horribly redundant to me. I don't know how far the bot API effort has got (has any work on it been done at all?), but what you're proposing doesn't help it; it only adds bulk to the software.
I am, slowly. I'm focussing more on a client-side reader/editor rather than bots.
My effort includes things like Special:Getimage (redirects to the URL of the given image) and giving other special pages an XML output (so the data is easily parsed).
The second proposal is a pseudo-language, called "bot" or "none", that would cause MediaWiki to return system messages untranslated, in the form {@[key]@}.
I wrote such a feature before, but with the parameter &debugmsg=1 rather than as a "pseudo-language". It was live at some point, but apparently it's not anymore. Someone must have reverted it.
That is a feature I don't quite understand. Why would a bot want it? To simplify screen-scraping?
What about messages like [[MediaWiki:Sidebar]], which are not literally pasted?
-- Jamie http://endeavour.zapto.org/astro73/
Jamie Bliss wrote:
I wrote such a feature before, but with the parameter &debugmsg=1 rather than as a "pseudo-language". It was live at some point, but apparently it's not anymore. Someone must have reverted it.
That is a feature I don't quite understand. Why would a bot want it? To simplify screen-scraping?
Well, when *I* wrote it, it was not targeted at bots, but rather at people trying to translate the interface or trying to correct an error they found. Seeing a string in context gives you a better idea of how it should be translated, so I thought it would be useful to be able to append &debugmsg=1 to see the string code and then find it to do the translation/correction.
What about messages like [[MediaWiki:Sidebar]], which are not literally pasted?
That *may* be the reason the feature was removed again.
wikitech-l@lists.wikimedia.org