Hi Primavera,
I'm not an admin on Commons,
but I think I can say a few things.
As you know, Commons, like all the other Wikimedia projects,
is based on MediaWiki, which cannot manage metadata properly (i.e. in
dedicated databases).
So far, we just have parsers and bots which read our templates (on
Wikipedia as on Commons)
to retrieve some metadata. Another way is to work directly with the
dumps (DBpedia does this, I think).
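To give an idea of what such a parser does, here is a minimal sketch in Python. It assumes the file page describes the work with the {{Information}} template and its usual Description/Source/Date/Author/Permission fields; the sample wikitext and the function name are made up for illustration, and real pages are much messier than this.

```python
# Hypothetical sample of file-page wikitext; real pages vary a lot.
SAMPLE_WIKITEXT = """== {{int:filedesc}} ==
{{Information
|Description = {{en|1=Example description of the work.}}
|Source      = Own work
|Date        = 2010-05-01
|Author      = [[User:Example|Example]]
|Permission  =
}}
"""


def extract_information_fields(wikitext):
    """Return the top-level 'key = value' pairs of the first
    {{Information}} template as a dict (empty if none is found)."""
    marker = "{{Information"
    start = wikitext.find(marker)
    if start == -1:
        return {}

    # Find the "}}" that closes the template by counting brace pairs.
    depth, i, end = 0, start, None
    while i < len(wikitext) - 1:
        pair = wikitext[i:i + 2]
        if pair == "{{":
            depth += 1
            i += 2
        elif pair == "}}":
            depth -= 1
            i += 2
            if depth == 0:
                end = i
                break
        else:
            i += 1
    if end is None:
        return {}  # unbalanced braces, give up

    body = wikitext[start + len(marker):end - 2]

    # Split on "|" only at the top level, so that pipes inside nested
    # templates ({{en|1=...}}) or links ([[User:X|X]]) are left alone.
    parts, current, depth, i = [], [], 0, 0
    while i < len(body):
        pair = body[i:i + 2]
        if pair in ("{{", "[["):
            depth += 1
            current.append(pair)
            i += 2
        elif pair in ("}}", "]]"):
            depth -= 1
            current.append(pair)
            i += 2
        elif body[i] == "|" and depth == 0:
            parts.append("".join(current))
            current = []
            i += 1
        else:
            current.append(body[i])
            i += 1
    parts.append("".join(current))

    fields = {}
    for part in parts:
        if "=" in part:
            key, value = part.split("=", 1)
            fields[key.strip()] = value.strip()
    return fields


if __name__ == "__main__":
    for key, value in extract_information_fields(SAMPLE_WIKITEXT).items():
        print(f"{key}: {value}")
```

The values come back as raw wikitext (nested templates and links included), so a real bot would still have to clean them up, and that is exactly why this approach is fragile.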
As far as I know, upgrading/shifting to a more granular, structured
system has been discussed *many* times, but right now no such system exists.
There are some MediaWiki extensions that could help (such as Semantic MediaWiki),
but they have not been deployed, for security reasons.
People interested in GLAM partnerships (Galleries, Libraries,
Archives, Museums) often discuss the need to manage metadata,
but it's a big and difficult issue, involving the very core of
MediaWiki.
(I'm the guy obsessed with having an OAI-PMH extension for MediaWiki, so
I understand you perfectly :-)
So, the only thing you can do is ask our developers/techies about
bots and other fancy scripts
that are currently doing a similar job.
Aubrey
PS: to Commons admins: please correct me if I said something wrong, but
this is the picture I have.
2011/8/19 Primavera De Filippi <pdefilippi(a)gmail.com>:
Hi all,
I am writing to you on behalf of the public domain working group of the
Open Knowledge Foundation. We are currently developing an automated
system to identify the legal status of different types of works (i.e.
to determine whether or not they are in the public domain). In order to do
this, we need to gather the necessary metadata to determine the legal
status of these works. This includes information such as title,
author, date of publication, etc.
You can find more information about the project on our site
http://publicdomain.okfn.org/calculators.
A preliminary implementation of the project can be seen at
www.publicdomainworks.net (site still under development).
Incorporating the metadata from the Wikimedia Commons archive into our
database would be extremely useful for us, since it would greatly
increase the quality of our results.
For example, in the case of
http://commons.wikimedia.org/wiki/File:Cyphoma_signatum_(Fingerprint_Cowry_…
- we would like to retrieve the information from the Summary section.
If I have understood correctly, the metadata for the works in the
archive is primarily text/HTML based.
Hence, I would like to know (a) whether there is a database from which
this metadata can be retrieved, or alternatively (b) whether you would
be interested in switching to a more structured database containing all
the relevant metadata about those works.
Looking forward to your answer,
Primavera
_______________________________________________
Commons-l mailing list
Commons-l(a)lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/commons-l