Hi,
Apologies for this dupe of a wiki discussion page, but this appears to be where all the action is.
It's fantastic news that the Wikimedia Foundation has been accepted as a mentor for Google Summer of Code 2006! The suggestion of writing a programmable REST/XML-RPC/SOAP API that exposes MediaWiki domain objects is really quite exciting, and I plan to apply with this project in mind.
Do you think that this project suggestion is likely to make the final cut? I'm starting to really set my heart on the idea, but I don't want to spend too much time thinking about it if it's unlikely to be used. I'm not entirely clear on how the application process works (even after reading the Google FAQ).
I commented on the wiki discussion page that there appears to be a third-party API run by Ontok (http://www.ontok.com/wiki/index.php/Wikipedia) - I assume they have no direct affiliation with the Wikimedia Foundation? Will there be any need to contact those people?
Also, someone has linked to the alpha version of a query interface that directly accesses the wiki database (http://en.wikipedia.org/w/query.php), but like "IndyGreg" on the discussion page I don't think this constitutes an API in the same sense described here, would you agree? Perhaps eventually this code could be integrated into the same extension.
It would be great to work with the pywikipediabot, perlmediawikiclient, and java mediawikiclient guys to create something that they could all use too, and the possibilities for the use of this API are endless!
Best Wishes
Ben
-- Ben "tola" Francis http://hippygeek.co.uk
Ben Francis wrote:
The suggestion of writing a programmable REST/XML-RPC/SOAP API that exposes MediaWiki domain objects is really quite exciting and I plan to apply with this project in mind. [...] It would be great to work with the pywikipediabot, perlmediawikiclient, and java mediawikiclient guys
I decided a while back to have an API panel at Wikimania's Hacking Days. The core MediaWiki hackers will all be there, and I'm hoping to also bring out one of the core pywikipediabot guys.
FWIW, I'd discourage this being taken on as a SoC project this summer, since it's a complicated problem that needs significant up-front discussion with a variety of people before any code should be written.
At 3:16 PM -0400 4/26/06, Ivan Krstic wrote:
FWIW, I'd discourage this being taken on as a SoC project this summer, since it's a complicated problem that needs significant up-front discussion with a variety of people before any code should be written.
I would love to see a first-cut API get developed, to take the place of Pywikipedia. I think that waiting until the "correct" API has been determined is inappropriate. Get something usable, robust, and reasonably speedy in place; then look for ways to refine and/or replace it over time.
-r
Rich Morin wrote:
At 3:16 PM -0400 4/26/06, Ivan Krstic wrote:
FWIW, I'd discourage this being taken on as a SoC project this summer, since it's a complicated problem that needs significant up-front discussion with a variety of people before any code should be written.
I would love to see a first-cut API get developed, to take the place of Pywikipedia. I think that waiting until the "correct" API has been determined is inappropriate. Get something usable, robust, and reasonably speedy in place; then look for ways to refine and/or replace it over time.
I agree with this. You can start off with some basic functions (retrieve an article, edit an article, retrieve article history) and later move on to more complex ones. These functions are largely independent of one another, and inconsistencies between them do not make the whole API fail.
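To make the "basic, independent functions" idea concrete, here is a minimal client-side sketch in Python. The endpoint path, action names, and parameter names are all hypothetical — nothing in this thread specifies an actual interface — but it shows how each basic call would stand alone:

```python
# Sketch of a minimal client for a hypothetical MediaWiki API.
# The endpoint path, action names, and parameters below are
# illustrative assumptions, not a real or proposed interface.
from urllib.parse import urlencode

BASE = "http://en.wikipedia.org/w/api.php"  # hypothetical endpoint

def build_request(action, **params):
    """Build the URL for one self-contained API call."""
    query = {"action": action, **params}
    # Sort parameters so the resulting URL is deterministic.
    return BASE + "?" + urlencode(sorted(query.items()))

# Each basic function is an independent call; a failure or
# inconsistency in one does not affect the others:
get_article = build_request("get", title="Sandbox")
edit_article = build_request("edit", title="Sandbox", text="Hello")
get_history = build_request("history", title="Sandbox", limit=10)
```

Because each call is self-contained, a server could ship "get" first and add "edit" and "history" later without breaking existing clients.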
Timwi wrote:
You can start off with some basic functions (retrieve an article, edit an article, retrieve article history) and later move on to more complex ones.
I thought about this some more yesterday, and I've changed my mind. A first pass at a *very* simple API, if done with care, might actually help frame later API discussions, such as the one at Hacking Days.
Too much of MediaWiki has traditionally been plagued by the "just sit down and spit out some code" approach to development, which led to an unseemly growth by agglomeration, and a codebase that's only recently been getting less horrid. Given the context -- volunteer coders, non-profit organization, etc -- some code is better than no code, so I'm not passing judgment here. But given a tangible opportunity to get a serious component like the API right (for some value of 'right' that's more than what one would obtain just by sitting down and hacking), waiting another couple of months to get everyone in a room and hash this out seems to be by far the best approach.
The API will be used by programs, not people. People can quickly adapt to changes in the UI, but every mistake that ends up in the API will probably become an incredible PITA that will need to be supported for some time going forward. So I'd still discourage the API being a SoC project. Of course, I'm not affiliated with the Foundation, so Ben is more than welcome to still submit his application for it.
I was wondering how it would be possible to make a real AJAX user interface without having a strong API. I've seen another SoC proposal about AJAX. Including two or three AJAX tools seems to be just a step that would be thrown away once a fully AJAX-based interface is built. Is it too early for that too?
Plyd
-- Ivan Krstic krstic@fas.harvard.edu | GPG: 0x147C722D _______________________________________________ Wikitech-l mailing list Wikitech-l@wikimedia.org http://mail.wikipedia.org/mailman/listinfo/wikitech-l
Plyd wrote:
I was wondering how it could be possible to make a real AJAX user interface without having a strong API.
It's not clear to me that we want a "real AJAX user interface". Things like the editing preview and recent changes are somewhat obvious candidates for a sprinkling of AJAX; which other parts of the user interaction with MW do you think would benefit significantly from it?
Let's not fix things that aren't broken.
There are several tools that could be interesting to have in AJAX - the editing preview, but I have several other examples that could be faster for the user:
- following or unfollowing a page without having to load a new page
- an [edit] button that opens a box directly inside the page without loading another one
- renaming/deleting etc. in the same way (a box added at the top of the page)
These tools are far from necessary, but they improve the use of MediaWiki and could reduce the number of requests to the servers.
I also have in mind a new SpecialRC.php page, in AJAX too. It would be updated automatically and would offer instant tools, like viewing the beginning of a diff just by hovering the mouse over the link, sending messages to an IP or a new user directly from the RC page, and other helpers to fight vandalism. (A tool that would finally turn the RC page into a Vandal Fighter tool, with all the advantages of being included in the browser.)
Plyd
On Thu, Apr 27, 2006 at 09:38:12AM -0400, Ivan Krstic wrote:
The API will be used by programs, not people.
Just a thought here, but one which is often overlooked...
One of the sets of programs which -- in the long run -- ought to migrate to using the API is... MediaWiki itself. This thought should shape the design decisions that are made with respect to the API.
Assuming someone can't come up with a compelling reason why this line of reasoning is evidence that I'm full of {myself,crap}. :-)
Cheers, - jra
Ivan Krstic wrote:
Timwi wrote:
You can start off with some basic functions (retrieve an article, edit an article, retrieve article history) and later move on to more complex ones.
I like the sound of this, but I'm probably biased because I'm really keen on writing it!
The API will be used by programs, not people. People can quickly adapt to changes in the UI, but every mistake that ends up in the API will probably become an incredible PITA that will need to be supported for some time going forward.
I have to admit that this is a very good point. An API is something that can be difficult to change once it's written because, by definition, once it's out there other programs will rely on it staying the same. Getting it right the first time is better than trying to retain backwards compatibility with a badly designed system. Writing an API is a big responsibility, especially if it ever gets used with something as hugely significant as Wikipedia.
However...
I thought about this some more yesterday, and I've changed my mind. A first pass at a *very* simple API, if done with care, might actually help frame later API discussions, such as the one at Hacking Days.
Yes, I would argue that going ahead and writing a first-cut API before the lengthy process of creating a final and stable specification does have an enormous amount of value. Firstly, it gets the ball rolling and gives people something to play with rather than just talk about. But more than that, by actually trying to write it we could learn an awful lot about what needs to go into that final, stable spec. People might try out the prototype API and come back with useful feedback about how the API could be improved before the final spec is set in stone. In fact, I think this may actually be a wise thing to do, especially as the final API is likely to stick around for a long time!
I propose brief community discussion and consultation with the people involved in the current clients with a view to writing a quick interim specification. Based on this specification an initial attempt could be started so there is at least some code in repositories somewhere by the time the hacking sessions at Wikimania come around. This initial API could be released as a "beta" system, with no guarantee of supporting it in the future and a disclaimer making it clear that it is an experimental system only.
This beta system should be carefully designed as if it were going to be the final system; ideally, it would not need many changes to meet the final specification. In the worst case it would have to be largely rewritten or even thrown away. That doesn't mean that writing it wasn't enormously worthwhile.
I suppose this is modern agile programming or rapid-prototyping thinking: get coding quickly and develop the design in parallel with the code. It sounds counter-intuitive, but there's a large amount of evidence to suggest it works better for software than the more traditional engineering methods used for tangible systems. It's also something I tried out this year with a two-and-a-half-month university project, and it went well.
Conveniently for my application ;), I think stating from the start that this code may not ever be officially supported actually makes the Client API a brilliant project for Google Summer of Code!
I wouldn't want to work on any project if there was any community resistance to the work, that would be completely counter-productive in an Open Source type environment.
What do people think?
Best Wishes
Ben
-- Ben "tola" Francis http://hippygeek.co.uk
Well, then make the first step in the non-official API a protocol-version call. The program must ask first:
"Hey! I use the mediawiki-beta-7.5 wiki API protocol plus the ugly-stuff addition v1.0.8."
The server answers one of:
- "Sorry, the API is disabled here."
- "Sorry, I can only handle mediawiki-beta-7.5.0.0.0.1. Please upgrade." (The program must then abort and show an alert to the user, forcing the upgrade.)
- "Yes, I understand this, and also Japanese and Esperanto. Feel free to use either of them. You can also expect me to continue understanding it (you can cache this response) for the next 3 days."
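The version handshake proposed here can be sketched as a small Python function. All version strings, response statuses, and the cache lifetime are made-up placeholders for illustration; the comparison of version strings is deliberately naive:

```python
# Sketch of the version-negotiation handshake proposed above.
# SUPPORTED versions, statuses, and CACHE_DAYS are illustrative
# assumptions, not part of any real MediaWiki protocol.

SUPPORTED = {"mediawiki-beta-7.5", "mediawiki-beta-7.5.0.0.0.1"}
API_ENABLED = True
CACHE_DAYS = 3  # how long the client may cache an acceptance

def negotiate(client_protocol):
    """Return (status, detail) for a client's opening version query."""
    if not API_ENABLED:
        return ("disabled", "Sorry, API is disabled here.")
    if client_protocol not in SUPPORTED:
        # Stand-in for a real version comparison scheme.
        newest = max(SUPPORTED)
        return ("upgrade", "I can only handle %s. Please upgrade." % newest)
    # Accepted: the client may cache this response for CACHE_DAYS days.
    return ("ok", CACHE_DAYS)
```

A client would call `negotiate` once at startup, abort with an upgrade alert on "upgrade", and cache an "ok" answer for the advertised number of days.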
I have to admit that this is a very good point. An API is something that can be difficult to change once it's written because, by definition, once it's out there other programs will rely on it staying the same. Getting it right the first time is better than trying to retain backwards compatibility with a badly designed system. Writing an API is a big responsibility, especially if it ever gets used with something as hugely significant as Wikipedia.
Well, I've been thinking along the following lines: each MediaWiki wiki administrator - in the case of Wikipedia that would be Wikimedia - can monitor which API calls are still regularly used. It's not like HTML, where you want backwards compatibility because you don't know how many millions of unmaintained old documents still exist and still need to be read; here, we do know. We can tell when the usage level of an old API function has dropped below a certain threshold, at which point we can decide to stop supporting it. Although we will end up requiring some tool developers to use the new API, and some users of unmaintained tools to switch to other tools, we can be sure that it won't be many. Hence I don't think backwards compatibility is going to be a huge long-term requirement for us.
Timwi
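Timwi's usage-monitoring idea could be sketched as follows: count calls per API function and flag for removal those whose share of recent traffic falls below a threshold. The function names and the 1% cutoff are invented for illustration:

```python
# Sketch of usage-based deprecation: flag API functions whose share
# of recent traffic is below THRESHOLD. Names and the threshold are
# illustrative assumptions, not an actual Wikimedia policy.
from collections import Counter

THRESHOLD = 0.01  # consider dropping support below 1% of total calls

def deprecation_candidates(call_log):
    """call_log: iterable of API function names, one per request.

    Returns the sorted list of functions used rarely enough that
    an administrator might decide to stop supporting them.
    """
    counts = Counter(call_log)
    total = sum(counts.values())
    return sorted(name for name, n in counts.items()
                  if n / total < THRESHOLD)
```

An administrator would feed this a window of recent request logs and review the candidates before actually removing anything.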
Moin,
On Saturday 29 April 2006 01:41, Timwi wrote:
I have to admit that this is a very good point, I think an API is something that can be difficult to change once it's written because by definition once it's out there other programs will rely on it staying the same. Getting it right the first time is better than trying to retain backwards compatibility with a badly designed sytem. Writing an API is a big responsibility, especially if it ever gets used with something as hugely significant as Wikipedia.
Well, I've been thinking along the following lines: each MediaWiki wiki administrator - in the case of Wikipedia that would be Wikimedia - can monitor which API calls are still regularly used. It's not like HTML, where you want backwards compatibility because you don't know how many millions of unmaintained old documents still exist and still need to be read; here, we do know. We can tell when the usage level of an old API function has dropped below a certain threshold, at which point we can decide to stop supporting it. Although we will end up requiring some tool developers to use the new API, and some users of unmaintained tools to switch to other tools, we can be sure that it won't be many. Hence I don't think backwards compatibility is going to be a huge long-term requirement for us.
Other wikis will use the same code, thus the same API, and there will be tools around that will work on these wikis on v1.x.y and break on v1.x.z.
So, IMHO backwards compatibility is of the utmost importance, hence it's important to "get it right" the first time.
best wishes,
tels
Brion,
You have removed the Client API entry from the Summer of Code page on the wiki with a comment "Removing this since a number of people are actively working on such already".
Who are the people who are actively working on this, and does this wiki change signify that the Client API project is officially no longer open for applications through Summer of Code? Bear in mind that applications open today.
Yours disappointedly,
Ben
-- Ben "tola" Francis http://hippygeek.co.uk
Brion and Ben,
User NicoT (http://meta.wikimedia.org/wiki/User:NicoT) contacted me about helping to develop the query API as part of Google's SoC. He outlined some of the goals for the API on his page (above), and has been actively looking at the code and communicating over IM. Even though the API might be less important than, let's say, a rich editor or new parser code, it should still be considered a valuable proposition, especially if value-added applications can be developed on top of the basic method calls.
--yurik
Yuri Astrakhan wrote:
Brion and Ben,
User NicoT (http://meta.wikimedia.org/wiki/User:NicoT) contacted me about helping to develop the query API as part of Google's SoC. He outlined some of the goals for the API on his page (above), and has been actively looking at the code and communicating over IM. Even though the API might be less important than, let's say, a rich editor or new parser code, it should still be considered a valuable proposition, especially if value-added applications can be developed on top of the basic method calls.
It's certainly valuable, but I don't think it fits within the Summer of Code framework, which requires the students to do their own work rather than collaborating in groups with the outside.
-- brion vibber (brion @ pobox.com)
Brion Vibber wrote:
It's certainly valuable, but I don't think it fits within the Summer of Code framework, which requires the students to do their own work rather than collaborating in groups with the outside.
The comment on the website states that it has been removed because people are already working on it; here you say it has been removed because you don't think it meets the criteria. Which is it?
Is there a final decision on this? There still appears to be confusion over whether this is a Google SoC project or not, and we now need to know whether it is worth applying for. We could just apply and watch our applications get ranked into oblivion, but if there's little point in applying I'd rather know so I can focus my efforts elsewhere :)
I hope I don't come across as a little obtuse; I'm just on my way to a lecture so I'm typing this quickly.
Cheers
Ben
-- Ben "tola" Francis http://hippygeek.co.uk
On 02/05/06, Ben Francis lists@hippygeek.co.uk wrote:
Brion Vibber wrote:
It's certainly valuable, but I don't think it fits within the Summer of Code framework, which requires the students to do their own work rather than collaborating in groups with the outside.
The comment on the website states that it has been removed because people are already working on it; here you say it has been removed because you don't think it meets the criteria. Which is it?
Is there a final decision on this? There still appears to be confusion over whether this is a Google SoC project or not, and we now need to know whether it is worth applying for. We could just apply and watch our applications get ranked into oblivion, but if there's little point in applying I'd rather know so I can focus my efforts elsewhere :)
Yes, there are people working on it. No, it's not a suitable SoC project. The rationale, I think, is that the API, like the formalised parser work, is the sort of thing which requires a lot of discussion and collaboration between advanced users, developers, extension writers, etc., and isn't suited to the sort of working environment where a student codes most of it without much interaction, save for his/her mentor.
Rob Church
Ben Francis wrote:
Brion Vibber wrote:
It's certainly valuable, but I don't think it fits within the Summer of Code framework, which requires the students to do their own work rather than collaborating in groups with the outside.
The comment on the website states that it has been removed because people are already working on it; here you say it has been removed because you don't think it meets the criteria. Which is it?
Both. It doesn't meet the criteria because multiple people are working on it, as it's a community effort already underway.
-- brion vibber (brion @ pobox.com)