On Fri, Jan 30, 2009 at 12:55 AM, Brianna Laugher
<brianna.laugher(a)gmail.com> wrote:
2009/1/30 Johannes Beigel
<johannes.beigel(a)pediapress.com>:
On 29.01.2009, at 13:48, Brianna Laugher wrote:
On Wikimedia Commons a little bit of work has
been done to this end:
<http://commons.wikimedia.org/wiki/Commons:Commons_API>
We've been aware of this page and Magnus' implementation, and we think
it looks really good!
The information is (AFAIK) scraped from the rendered XHTML of
articles. This could be done in a less error-prone way (and more
efficiently) if the data were stored and accessed via a database in
some way. Of course this would require some discussion, formal
decisions and code changes. But as I stated in an earlier post: I
think MediaWiki is so widely used by people who want to share and
collaborate on free content, that it's not too farfetched to build
some "license infrastructure" into the software itself.
I agree that it makes a lot of sense. But because it would be a big
change, I fear that unless the lead developers show great enthusiasm
for the idea, it will take a very long time to be accepted and
completed. Building an "add-on" tool, by contrast, can reach the
point of functionality faster.
It may be a good idea to try and build the Commons API to mimic the
MediaWiki API, imagining that in the future such information will be
available via that. So then hopefully for now people could use the
Commons API, and in the future switch to the MediaWiki API by just
changing the API URL, and all their queries could stay the same.
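To illustrate the "just change the API URL" idea, here is a minimal sketch. Both endpoint URLs and the "licenseinfo" property are hypothetical placeholders; neither API is known to offer such a property. The only point is that the query parameters stay identical while the base URL is swapped.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

# Hypothetical endpoints, used only for illustration.
COMMONS_API_URL = "https://example.org/commons-api.php"
MEDIAWIKI_API_URL = "https://example.org/w/api.php"


def build_query_url(api_url, title):
    """Build a query URL; only api_url differs between the two back ends."""
    params = {
        "action": "query",
        "prop": "licenseinfo",  # hypothetical property name
        "titles": title,
        "format": "json",
    }
    return api_url + "?" + urlencode(params)


def query_part(url):
    """Return the parsed query string of a URL as a dict of lists."""
    return parse_qs(urlsplit(url).query)


today = build_query_url(COMMONS_API_URL, "File:Example.jpg")
later = build_query_url(MEDIAWIKI_API_URL, "File:Example.jpg")
```

A client written this way would keep the base URL in one configuration value, so a later switch touches a single line.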
There is a big conceptual difference between the two APIs, IMHO. The
MediaWiki API can be used to query technically defined things: Link
lists, categories, template usage and so on. A Commons API (mine or
someone else's) parses the content itself for data and relations that
are not technically defined.
One way would be to add some kind of license metadata per page into
the database. This is possible, but rather specific; also, it would
likely mean creating a separate interface just for that.
The better way (IMHO) is to store all used
"page:template:parameter:value" tuples in a wiki in a separate
database table, which could be queried by the MediaWiki API. This has
been suggested time and again by me and others. It would then be much
easier for a third-party API to get the relevant data for a page. The
functionality is part of Semantic MediaWiki, but would actually scale
as a project on its own ;-)
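A rough sketch of what such a tuple table could look like, using SQLite from Python. The table name, column names, and sample rows are all made up for illustration; this is not an existing MediaWiki schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# One row per "page:template:parameter:value" tuple (hypothetical schema).
conn.execute(
    """
    CREATE TABLE template_params (
        page      TEXT NOT NULL,
        template  TEXT NOT NULL,
        parameter TEXT NOT NULL,
        value     TEXT NOT NULL
    )
    """
)
conn.execute("CREATE INDEX idx_template_params_page ON template_params (page)")

# Invented sample data.
rows = [
    ("File:Example.jpg", "Information", "author", "Jane Doe"),
    ("File:Example.jpg", "Information", "source", "own work"),
    ("File:Example.jpg", "Cc-by-sa-3.0", "attribution", "Jane Doe"),
    ("File:Other.png", "Information", "author", "John Roe"),
]
conn.executemany("INSERT INTO template_params VALUES (?, ?, ?, ?)", rows)


def templates_on_page(conn, page):
    """Return the distinct template names used on a page, sorted."""
    cur = conn.execute(
        "SELECT DISTINCT template FROM template_params "
        "WHERE page = ? ORDER BY template",
        (page,),
    )
    return [row[0] for row in cur]


def parameter_value(conn, page, template, parameter):
    """Return the stored value for one tuple, or None if it is absent."""
    cur = conn.execute(
        "SELECT value FROM template_params "
        "WHERE page = ? AND template = ? AND parameter = ?",
        (page, template, parameter),
    )
    row = cur.fetchone()
    return row[0] if row else None
```

With a table like this, a third-party tool could look up license templates or author fields with a plain query instead of scraping rendered XHTML.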
This approach would also allow for the integration of tools like
TemplateTiger [1] directly into Wikipedia.
Magnus
[1]
http://toolserver.org/~kolossos/templatetiger/tt-table4.php?template=Person…