On Thursday 26 January 2006 21:04, Ævar Arnfjörð Bjarmason wrote:
> There are several issues with implementing a robot API like that, one
> is that a lot of our logic is still tied to our current XHTML output
> code, which would have to be split off into a backend and presentation
> frontends.
i have spent quite a lot of thought on a machine-friendly API for MediaWiki.
naturally, at first i focussed on getting some SOAP, XML-RPC or REST API.
then i ran into the above-mentioned problem: i would not like to implement
an API if i don't have a cleanly separated backend i can use, and building
that would mean a lot of work.
the next thing i did was: i wrote an API using simple screen-scraping,
which works surprisingly well. you can generate regexps from
[[Special:Allmessages]] output and take the namespaces from
[[Special:Export]]. after using this API for some time, i noticed that
the forms MediaWiki uses and the format of its XHTML seldom change, and
the XHTML tags carry most of the classes and ids you need to find the
data inside pages.
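as an illustration, here is a minimal sketch of that scraping approach
in python. the wiki URL is made up, and the element names in the export
XML are assumed from the current export format; treat it as a sketch,
not a finished client:

    import re
    import urllib.parse
    import urllib.request
    import xml.etree.ElementTree as ET

    BASE = "http://localhost/wiki/index.php"  # hypothetical wiki URL

    def fetch(params):
        """GET a page from the wiki and return its body as text."""
        url = BASE + "?" + urllib.parse.urlencode(params)
        with urllib.request.urlopen(url) as f:
            return f.read().decode("utf-8")

    def message_to_regex(msg):
        """turn a UI message with $1, $2 ... placeholders (as listed
        on [[Special:Allmessages]]) into a regexp with one capture
        group per placeholder."""
        return re.compile(re.sub(r"\\\$\d+", "(.+?)", re.escape(msg)))

    def namespaces():
        """read the namespace map from the <siteinfo> block that
        [[Special:Export]] wraps around a dump."""
        root = ET.fromstring(fetch({"title": "Special:Export/Main_Page"}))
        # the export XML is namespaced; match on the local tag name
        # instead of hard-coding one export-format version
        return {el.get("key"): (el.text or "")
                for el in root.iter() if el.tag.endswith("}namespace")}

for example, message_to_regex("(Redirected from $1)") can pull the
redirect source out of a rendered page, assuming the wiki runs with the
default message text.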
in a little brainstorming with [[meta:User:Duesentrieb]] the idea
of a bot language appeared, which would make a screen-scraping API
quite stable. seen from the "we need a properly separated backend layer,
then we build an API on top of it" perspective, this is ugly, and it
does not go far enough for the client part of the API. besides, brion
hates it, because it's not the right way to do it.
after that, i began to see MediaWiki as a somewhat convoluted server for
an ajax application i was building on top of it. again, it's surprising
how well such a thing works.
then it came to me: we nearly have a usable REST-style API.
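to illustrate the point: reading a page already works with a plain GET,
no scraping at all, on wikis where action=raw is enabled (the URL below
is made up):

    import urllib.parse
    import urllib.request

    BASE = "http://localhost/wiki/index.php"  # hypothetical wiki URL

    def get_wikitext(title):
        """fetch the raw wikitext of a page; action=raw skips the
        skin entirely, so there is nothing to scrape."""
        url = BASE + "?" + urllib.parse.urlencode(
            {"title": title, "action": "raw"})
        with urllib.request.urlopen(url) as f:
            return f.read().decode("utf-8")

    print(get_wikitext("Main Page"))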
there are three parts to an API. the first is a path for data to get
into the server. for this purpose, i think the forms are usable, because
they are stable enough. if the forms change, the API changes, yes, but
when they are changed there is a reason for it, and in many cases that
reason would make a change in the API necessary anyway.
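a hedged sketch of what "the forms are the write path" means in
practice: scrape the hidden fields out of the edit form, then POST the
new text back the way a browser would. the field names follow the stock
edit form, the regexp assumes its usual XHTML attribute order, and the
whole thing presumes an anonymous edit (a logged-in bot would also need
to carry cookies):

    import re
    import urllib.parse
    import urllib.request

    BASE = "http://localhost/wiki/index.php"  # hypothetical wiki URL

    def save_page(title, text, summary=""):
        # step 1: fetch the edit form to pick up its hidden fields
        # (edit token and timestamps)
        with urllib.request.urlopen(
                BASE + "?" + urllib.parse.urlencode(
                    {"title": title, "action": "edit"})) as f:
            form = f.read().decode("utf-8")
        fields = dict(re.findall(
            r'name="(wp\w+)"[^>]*value="([^"]*)"', form))
        # step 2: post the new text back through the same form target
        data = urllib.parse.urlencode({
            "wpTextbox1": text,
            "wpSummary": summary,
            "wpEditToken": fields.get("wpEditToken", ""),
            "wpEdittime": fields.get("wpEdittime", ""),
            "wpStarttime": fields.get("wpStarttime", ""),
            "wpSave": "Save page",
        }).encode("utf-8")
        urllib.request.urlopen(
            BASE + "?" + urllib.parse.urlencode(
                {"title": title, "action": "submit"}), data)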
the second part is getting data out of the server: it has to be
marshalled in some way the client can understand. the XHTML output
our skins produce is not perfect, but it's possible to write
a skin that generates XHTML structured well enough, with more ids
and classes, for the output to be used everywhere XML can be parsed
and queried.
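for example, with a skin that sets an id on the content element (like
MonoBook's "bodyContent" div), a generic parser can cut the interesting
part out of any rendered page. this sketch assumes well-formed XHTML:

    from html.parser import HTMLParser

    class ContentGrabber(HTMLParser):
        """collect all text inside the element with a given id."""
        def __init__(self, target_id):
            super().__init__()
            self.target_id = target_id
            self.depth = 0      # nesting depth inside the target
            self.chunks = []

        def handle_starttag(self, tag, attrs):
            if self.depth:
                self.depth += 1
            elif dict(attrs).get("id") == self.target_id:
                self.depth = 1

        def handle_endtag(self, tag):
            if self.depth:
                self.depth -= 1

        def handle_data(self, data):
            if self.depth:
                self.chunks.append(data)

    def page_text(xhtml, content_id="bodyContent"):
        parser = ContentGrabber(content_id)
        parser.feed(xhtml)
        return "".join(parser.chunks)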
the third part is the application logic working with the incoming
data and producing outgoing data. this part we get for free: it's
already there, it's properly tested, and i would not want to mess
with it.
daniel