[Commons-l] Commons API

Brianna Laugher brianna.laugher at gmail.com
Sun Mar 30 14:10:07 UTC 2008


There is an interesting Firefox extension called Zemanta that works
with some blogging platforms to suggest images matching the blog post
you are typing. One of the sources it uses is Commons.
See this post (comments) for a description of how it works and what
it's lacking:

In particular,
"If you have an idea how to correctly capture wikipedia images
attribution (something that would assure at least 50% correct coverage
from 2.8M images), please help us! ;)"

Really, we can't blame people too much for not providing attribution
when we don't present that information in a standard way, or offer a
standard way of accessing it.

Now is as good a time as any to formally write up an API that we can
recommend to other people. Aside from the MediaWiki API, there are three
main things I can think of that often need to be automated:
* identify any "problem tags" (files with deletion markers shouldn't
be used or indexed by third parties)
* extract license name(s) and URL for a given file
* extract author attribution string for a given file

So I propose we put our heads together and figure out the most robust
algorithm for each of these, and provide some sample code for each.
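
As a rough illustration of the three tasks above, here is a minimal
Python sketch that works on the raw wikitext of a file description page.
It is only a starting point under stated assumptions: the template names
in PROBLEM_TEMPLATES and the prefixes in LICENSE_PREFIXES are
illustrative and incomplete, the helper names are made up for this
example, and the author is assumed to live in an "|Author=" line. Mapping
license names to URLs, and licenses passed as template parameters, are
not handled.

```python
import re

# Illustrative, incomplete lists (assumptions, not an official registry).
PROBLEM_TEMPLATES = {"delete", "speedydelete", "copyvio",
                     "no source", "no license", "no permission"}
LICENSE_PREFIXES = ("cc-", "gfdl", "pd-")

# Captures the name of each template call: the text between "{{" and
# the first "|" or "}}".
TEMPLATE_RE = re.compile(r"\{\{\s*([^|{}]+?)\s*(?:\||\}\})")

# Captures the value of an "|Author=" line (case-insensitive).
AUTHOR_RE = re.compile(r"^[ \t]*\|[ \t]*author[ \t]*=[ \t]*(.*?)[ \t]*$",
                       re.IGNORECASE | re.MULTILINE)


def template_names(wikitext):
    """Return the lower-cased names of all templates used in the wikitext."""
    return [m.group(1).strip().lower() for m in TEMPLATE_RE.finditer(wikitext)]


def problem_tags(wikitext):
    """Return templates suggesting the file should not be reused."""
    return [n for n in template_names(wikitext) if n in PROBLEM_TEMPLATES]


def license_tags(wikitext):
    """Return templates whose names look like license tags."""
    return [n for n in template_names(wikitext) if n.startswith(LICENSE_PREFIXES)]


def author_string(wikitext):
    """Return the raw author field, or None if it is missing or empty."""
    m = AUTHOR_RE.search(wikitext)
    return m.group(1) if m and m.group(1) else None


# Made-up description page used only to exercise the helpers.
sample = (
    "{{Information\n"
    "|Description=A cat\n"
    "|Author=Jane Example\n"
    "}}\n"
    "{{cc-by-sa-3.0}}\n"
    "{{delete|reason=test}}\n"
)
```

A third party would call problem_tags() first and skip the file if it
returns anything, then use license_tags() and author_string() to build
the attribution line.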

I made a start here:


Contributions and feedback welcome...


