context
-------
i’m working on a mediawiki extension, http://www.mediawiki.org/wiki/Extension:GWToolset, which has as one of its goals the ability to upload media files to a wiki. the extension, among other tasks, will process an XML file that has a list of urls to media files and upload those media files to the wiki along with metadata contained within the XML file. our ideal goal is to have this extension run on http://commons.wikimedia.org/.

background
----------
http://commons.wikimedia.org/wiki/Commons:GLAMToolset_project/Request_for_Comments/Technical_Architecture

Metadata Set Repo
-----------------
one of the goals of the project is to store Metadata Sets, such as XML, under some type of version control. those Metadata Sets need to be accessible so that the extension can grab their content and process it. processing involves iterating over the entire Metadata Set and creating Jobs for the Job Queue; each Job will upload an individual media file and its metadata into a media file page using a MediaWiki template format, such as Artwork.

some initial requirements

• File sizes
  • can range from a few kilobytes to several megabytes.
  • max file-size is 100mb.
• XML Schema - not required.
• XML DTD - not required.
• When metadata is in XML format, each record must consist of a single parent with many children.
• XML attribute lang= is the only one currently used, and it is used without user interaction.
• There is no need to display the Metadata Sets in the wiki.
• There is no need to edit the Metadata Sets in the wiki.

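
to make the record requirements above concrete, here is a minimal sketch in Python (the extension itself is PHP, so this is only an illustration). it streams a Metadata Set one record at a time, which matters for files approaching the 100mb limit. the element names `records`, `record`, `title` and `url`, the sample data, and the helper `iter_records` are all assumptions made up for the example; they are not taken from GWToolset.

```python
import io
import xml.etree.ElementTree as ET


def iter_records(source, record_tag):
    """Yield each record as a list of its child fields, without loading the whole file."""
    for _event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag != record_tag:
            continue  # child elements end before their parent; wait for the parent
        fields = []
        for child in elem:
            fields.append({
                "name": child.tag,
                "lang": child.get("lang"),  # None when the attribute is absent
                "value": (child.text or "").strip(),
            })
        yield fields
        elem.clear()  # drop the children of the record that was just handled


# hypothetical Metadata Set: each record is a single parent with many children
SAMPLE = b"""<records>
  <record>
    <title lang="en">The Night Watch</title>
    <title lang="nl">De Nachtwacht</title>
    <url>http://example.org/night-watch.jpg</url>
  </record>
  <record>
    <title lang="en">The Milkmaid</title>
    <url>http://example.org/milkmaid.jpg</url>
  </record>
</records>"""

records = list(iter_records(io.BytesIO(SAMPLE), "record"))
for fields in records:
    print(fields)
```

in the extension, each yielded record would presumably become one Job for the Job Queue rather than being collected into a list.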
we initially developed the extension to store the files in the File: namespace, but we were told by the Foundation that we should use ContentHandler instead. unfortunately there is an issue with storing content > 1mb in the db so we need to find another solution.
1. any suggestions?

Mapping
-------
a mapping is a JSON document that maps a metadata set to a mediawiki template. we’re currently storing those as Content in the namespace GWToolset. an entry might be in GWToolset:Metadata_Mappings/Dan-nl/Rijkmuseum.

1. does that namespace make sense?
   a. if not, what namespace would you recommend?
2. does this concept make sense?
   a. if not, what would you recommend?
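
to illustrate the mapping concept, here is a minimal Python sketch that applies a JSON mapping to one metadata record and produces template wikitext. the mapping layout (template parameter name pointing to a list of metadata element names), the record contents, and the helper `render_template` are assumptions for the example; the real GWToolset mapping format may differ.

```python
import json

# hypothetical mapping: template parameter -> metadata element names that feed it
MAPPING_JSON = """
{
  "title": ["title"],
  "source": ["url"]
}
"""


def render_template(template_name, mapping, record):
    """Build template wikitext from a record, skipping parameters with no data."""
    lines = ["{{" + template_name]
    for param, element_names in mapping.items():
        values = [record[name] for name in element_names if name in record]
        if values:
            lines.append("| " + param + " = " + "; ".join(values))
    lines.append("}}")
    return "\n".join(lines)


mapping = json.loads(MAPPING_JSON)
record = {"title": "The Night Watch", "url": "http://example.org/night-watch.jpg"}
wikitext = render_template("Artwork", mapping, record)
print(wikitext)
```

the sketch does not escape values that contain `|` or `}}`, which would need handling before real use.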

Maintaining Original Metadata Snippet & Mapping
-----------------------------------------------
another goal is to link or somehow connect the original metadata used to create the media file:

• metadata set
• metadata snippet
• metadata mapping

the current thought is to insert these items as comments within the wiki text of the media file page.

1. does that make sense?
   a. if not, what would you recommend doing?
2. is there a better way to do this?
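
for reference, the comment approach could look roughly like the Python sketch below. the labels, the sample values, and the helpers `as_wikitext_comment` and `annotate_page` are hypothetical. one detail the sketch has to deal with: a `-->` inside the stored metadata would close the comment early, so it is escaped here, which means the stored snippet is no longer byte-for-byte identical to the original.

```python
def as_wikitext_comment(label, payload):
    """Wrap payload in an HTML comment, escaping any sequence that would end it early."""
    safe = payload.replace("-->", "--&gt;")
    return "<!-- {}\n{}\n-->".format(label, safe)


def annotate_page(wikitext, metadata_set, snippet, mapping):
    """Append the three provenance items to the page text as comments."""
    comments = [
        as_wikitext_comment("metadata set", metadata_set),
        as_wikitext_comment("metadata snippet", snippet),
        as_wikitext_comment("metadata mapping", mapping),
    ]
    return wikitext + "\n" + "\n".join(comments)


page = annotate_page(
    "{{Artwork\n| title = The Night Watch\n}}",
    "example-set.xml",                      # hypothetical Metadata Set name
    "<title>arrow --> in text</title>",     # snippet containing a comment terminator
    "GWToolset:Metadata_Mappings/Dan-nl/Rijkmuseum",
)
print(page)
```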

mediawiki template parameters
-----------------------------
the application needs to know which mediawiki template parameters exist and are available for mapping media file metadata to the mediawiki templates. for the moment we are hard-coding these parameters in a db table and sometimes in the code. this is not ideal. i have briefly looked at TemplateData, but haven’t had enough time to see if it would address our needs.

1. is there a way to programmatically discover the available parameters for a mediawiki template?
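
as a rough idea of what a TemplateData-based lookup might involve, here is a Python sketch. the TemplateData extension adds an `action=templatedata` module to the MediaWiki API, but the response shape used below (a `pages` object whose entries carry a `params` object) is written from memory and should be checked against the API documentation. the sample response is made up, and the helpers `templatedata_url` and `parameter_names` are hypothetical. this would also only help for templates that actually have TemplateData defined.

```python
import json
from urllib.parse import urlencode


def templatedata_url(api_endpoint, template_title):
    """Build the API request URL for one template (no request is sent here)."""
    query = urlencode({
        "action": "templatedata",
        "titles": template_title,
        "format": "json",
    })
    return api_endpoint + "?" + query


def parameter_names(response):
    """Collect parameter names from every page in an assumed TemplateData response."""
    names = []
    for page in response.get("pages", {}).values():
        names.extend(page.get("params", {}).keys())
    return names


# made-up response for illustration; real output should be verified against the API
SAMPLE_RESPONSE = json.loads("""
{
  "pages": {
    "1": {
      "title": "Template:Artwork",
      "params": {
        "artist": {"required": false},
        "title": {"required": false}
      }
    }
  }
}
""")

url = templatedata_url("http://commons.wikimedia.org/w/api.php", "Template:Artwork")
params = parameter_names(SAMPLE_RESPONSE)
print(url)
print(params)
```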

thanks in advance for your help!
dan