context
-------
i’m working on a MediaWiki extension,
http://www.mediawiki.org/wiki/Extension:GWToolset, one of whose goals is to
upload media files to a wiki. among other tasks, the extension will process
an XML file that contains a list of urls to media files and upload those
media files to the wiki along with the metadata contained in the XML file.
our ideal goal is to have this extension run on
http://commons.wikimedia.org/.
background
----------
http://commons.wikimedia.org/wiki/Commons:GLAMToolset_project/Request_for_Co…
Metadata Set Repo
-----------------
one of the goals of the project is to store Metadata Sets, such as XML
files, under some type of version control. those Metadata Sets need to be
accessible so that the extension can fetch their content and process it.
processing involves iterating over the entire Metadata Set and creating
Jobs for the Job Queue; each Job uploads an individual media file and
writes its metadata into the media file page using a MediaWiki template,
such as Artwork.
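
to make the "iterate and create Jobs" step concrete, here is a minimal sketch in Python (not the extension's actual PHP code). it streams the XML so that a large Metadata Set never has to be held in memory at once. the element names `record` and `url`, and the shape of the job dict, are assumptions made up for illustration.

```python
import io
import xml.etree.ElementTree as ET


def iter_records(source, record_tag):
    """Yield one dict per record element while streaming through the XML."""
    for _event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag != record_tag:
            continue
        fields = {}
        for child in elem:
            # a record is a single parent with many children; a child tag
            # may repeat, so collect values in a list per tag
            fields.setdefault(child.tag, []).append((child.text or "").strip())
        yield fields
        elem.clear()  # drop the record's children once it has been handled


def make_jobs(source, record_tag, url_tag):
    """Turn every record that has a media url into a job descriptor."""
    jobs = []
    for fields in iter_records(source, record_tag):
        urls = fields.get(url_tag)
        if not urls:
            continue  # nothing to upload for this record
        jobs.append({"url": urls[0], "metadata": fields})
    return jobs


# hypothetical two-record Metadata Set
SAMPLE = b"""<records>
  <record><title>Night Watch</title><url>http://example.org/a.jpg</url></record>
  <record><title>Milkmaid</title><url>http://example.org/b.jpg</url></record>
</records>"""

jobs = make_jobs(io.BytesIO(SAMPLE), "record", "url")
```

in the real extension each descriptor would presumably become a Job pushed onto the Job Queue rather than an entry in a list.
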
some initial requirements:
• File sizes
  • can range from a few kilobytes to several megabytes.
  • max file size is 100 MB.
• XML Schema - not required.
• XML DTD - not required.
• When metadata is in XML format, each record must consist of a single
  parent element with many child elements.
• The XML attribute lang= is the only attribute currently used, and it is
  handled without user interaction.
• There is no need to display the Metadata Sets in the wiki.
• There is no need to edit the Metadata Sets in the wiki.
we initially developed the extension to store the files in the File:
namespace, but we were told by the Foundation that we should use
ContentHandler instead. unfortunately, there is an issue with storing
content larger than 1 MB in the db, so we need to find another solution.
1. any suggestions?
Mapping
-------
a mapping is a JSON document that maps a metadata set to a MediaWiki
template. we’re currently storing those as Content in the GWToolset
namespace. an entry might be at
GWToolset:Metadata_Mappings/Dan-nl/Rijkmuseum.
1. does that namespace make sense?
   a. if not, what namespace would you recommend?
2. does this concept make sense?
   a. if not, what would you recommend?
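
to illustrate the mapping concept, here is a rough Python sketch of how a JSON mapping could be applied to one metadata record to produce template wikitext. the JSON layout (a `template` name plus `parameters` pointing at XML element names) is an assumption invented for this example, not the format the extension actually stores, and the parameter names are only illustrative.

```python
import json

# hypothetical mapping: template parameter -> XML element names to read from
MAPPING_JSON = """
{
  "template": "Artwork",
  "parameters": {
    "title": ["title"],
    "artist": ["creator"],
    "source": ["url"]
  }
}
"""


def render_template(mapping, record):
    """Build template wikitext from a mapping and one metadata record."""
    lines = ["{{" + mapping["template"]]
    for param, element_names in mapping["parameters"].items():
        # gather the values of every mapped element present in the record
        values = [record[name] for name in element_names if name in record]
        if values:
            lines.append(" |" + param + " = " + "; ".join(values))
    lines.append("}}")
    return "\n".join(lines)


mapping = json.loads(MAPPING_JSON)
record = {"title": "The Milkmaid", "creator": "Johannes Vermeer"}
wikitext = render_template(mapping, record)
```

parameters with no matching element in the record are simply left out. a real implementation would also need to escape values containing characters such as `|` or `}}`, which this sketch ignores.
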
Maintaining Original Metadata Snippet & Mapping
-----------------------------------------------
another goal is to link, or somehow connect, each media file to the
original metadata used to create it:
• metadata set
• metadata snippet
• metadata mapping
the current thought is to insert these items as comments within the
wikitext of the media file page.
1. does that make sense?
   a. if not, what would you recommend doing?
2. is there a better way to do this?
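
as a sketch of the comment idea, the Python below serialises the three items into one HTML comment and reads them back. the marker string `GWToolset-metadata` and the payload keys are hypothetical. the one subtle point is that the payload must never contain `-->`, which would end the comment early; here every `>` is written as the JSON escape `\u003e`, which decodes back to `>` when the JSON is loaded.

```python
import json
import re

MARKER = "GWToolset-metadata"  # hypothetical marker used to find the comment


def to_comment(payload):
    """Serialise the payload into a single HTML comment."""
    # ">" only ever occurs inside JSON strings, so replacing it with its
    # unicode escape keeps the JSON valid and rules out a stray "-->"
    body = json.dumps(payload, sort_keys=True).replace(">", "\\u003e")
    return "<!-- " + MARKER + " " + body + " -->"


def from_wikitext(wikitext):
    """Return the embedded payload, or None if no comment is found."""
    pattern = r"<!-- " + re.escape(MARKER) + r" (.*?) -->"
    match = re.search(pattern, wikitext, re.S)
    return json.loads(match.group(1)) if match else None


payload = {
    "metadata_set": "example-set.xml",
    "metadata_mapping": "GWToolset:Metadata_Mappings/Dan-nl/Rijkmuseum",
    "metadata_snippet": "<record><title>a --> b</title></record>",
}
page_text = "{{Artwork\n |title = a\n}}\n" + to_comment(payload)
restored = from_wikitext(page_text)
```

storing references (page titles or revision ids) instead of the full snippet would keep the page text smaller; whether that fits the project's needs is an open question.
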
mediawiki template parameters
-----------------------------
the application needs to know which MediaWiki template parameters exist and
are available for mapping media file metadata to the MediaWiki
templates. for the moment we are hard-coding these parameters in a db table
and sometimes in the code, which is not ideal. i have briefly looked at
TemplateData, but haven’t had enough time to see whether it would address
our needs.
1. is there a way to programmatically discover the available parameters for
a MediaWiki template?
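
one possible route, assuming the wiki has the TemplateData extension installed and the template page carries a templatedata block, is the `action=templatedata` API module. the Python sketch below only builds the request url and pulls parameter names out of a response. the sample response is hand-written to resemble the API's general shape (a `pages` object whose entries hold `params`); it is not real output, and the page id and parameters in it are made up.

```python
import json
from urllib.parse import urlencode


def templatedata_url(api_base, template_title):
    """Build the api.php url that asks for a template's TemplateData."""
    query = urlencode({
        "action": "templatedata",
        "titles": template_title,
        "format": "json",
    })
    return api_base + "?" + query


def parameter_names(response):
    """Collect the parameter names from a decoded templatedata response."""
    names = []
    for page in response.get("pages", {}).values():
        names.extend(page.get("params", {}).keys())
    return names


url = templatedata_url("https://commons.wikimedia.org/w/api.php",
                       "Template:Artwork")

# hand-written sample, NOT real API output
SAMPLE_RESPONSE = json.loads("""
{
  "pages": {
    "1": {
      "title": "Template:Artwork",
      "params": {
        "artist": {"required": false},
        "title": {"required": false},
        "source": {"required": true}
      }
    }
  }
}
""")

names = parameter_names(SAMPLE_RESPONSE)
```

templates without a templatedata block would yield no parameters this way, so the hard-coded table might still be needed as a fallback.
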
thanks in advance for your help!
dan