Developers at this weekend's GLAMCamp NYC
http://meta.wikimedia.org/wiki/GLAMcamp_NYC
are developing a data-munging tool, based on pywikipediabot, to aid in mass uploads. They'll be hacking on it in sprints this weekend, starting 11am-12:30pm NYC time tomorrow, Saturday the 21st. Join them in person, or in #glamwiki on Freenode.
See notes from today's preliminary session:
http://etherpad.wikimedia.org/GLAMcampNYC-ut
Summary, I believe by Maarten Zeinstra:
There is a Python library (pywikipediabot) from which many of Maarten's bots are derived[1]. The desired outcome of this session is to turn it into a library that functions as a black box for uploading data to Wikimedia Commons. This library will have one external function, "put(metadata, configuration)". It also needs to include a function to check for duplicates.
configuration will be a dictionary that holds the following keys:
- configurationTemplate, holds URL to configuration template
- configurationTitleTemplate, holds configuration of Title Template
- sourceKey, holds the key of the metadata dict that indicates the URL of the source
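To make the description above concrete, here is a minimal dry-run sketch of what put(metadata, configuration) might look like. It does not upload anything and does not use pywikipediabot. The three configuration keys come from the notes; everything else is an assumption for illustration: the example.org URLs, the metadata keys, the returned dict, and in particular the idea that configurationTitleTemplate is a Python format string filled in from the metadata.

```python
# Keys named in the session notes.
REQUIRED_KEYS = ("configurationTemplate", "configurationTitleTemplate", "sourceKey")


def put(metadata, configuration):
    """Dry run: validate the two dicts and return what would be uploaded."""
    missing = [key for key in REQUIRED_KEYS if key not in configuration]
    if missing:
        raise KeyError("configuration is missing keys: " + ", ".join(missing))

    source_key = configuration["sourceKey"]
    if source_key not in metadata:
        raise KeyError("metadata has no source URL under key %r" % source_key)

    # Assumption: the title template is a str.format pattern over metadata keys.
    title = configuration["configurationTitleTemplate"].format(**metadata)

    return {
        "source_url": metadata[source_key],
        "title": title,
        "template_url": configuration["configurationTemplate"],
    }


# Hypothetical example data.
configuration = {
    "configurationTemplate": "http://example.org/templates/photograph",
    "configurationTitleTemplate": "{title} ({institution}).jpg",
    "sourceKey": "url",
}
metadata = {
    "title": "Windmill",
    "institution": "Example Museum",
    "url": "http://example.org/images/windmill.jpg",
}

planned = put(metadata, configuration)
```

A real implementation would hand the result to the upload code instead of returning it, and would run the duplicate check before uploading.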
An extra module will be written to ingest different formats and offer their metadata as a dictionary in key-value format (the metadata argument of put()). This module can be given a GUI. It will be written as a base class which can be subclassed or extended to support different standards.
[1] https://fisheye.toolserver.org/browse/multichill/bot/Europeana/Europeana_upl...
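The ingest module described in the summary could be structured roughly as follows. This is only a sketch under assumptions: the class names MetadataReader and CsvMetadataReader, the records() method, and the sample data are hypothetical and not taken from the notes. It shows a base class with one subclass per input format, each yielding plain key-value dicts suitable as the metadata argument of put().

```python
import csv
import io


class MetadataReader:
    """Base class: each subclass turns one input format into metadata dicts."""

    def records(self):
        raise NotImplementedError("subclasses must implement records()")


class CsvMetadataReader(MetadataReader):
    """Reads CSV text whose first row holds the metadata keys."""

    def __init__(self, text):
        self.text = text

    def records(self):
        # csv.DictReader maps each data row to the header names.
        for row in csv.DictReader(io.StringIO(self.text)):
            yield dict(row)


# Hypothetical sample input.
sample = "title,url\nWindmill,http://example.org/images/windmill.jpg\n"
rows = list(CsvMetadataReader(sample).records())
```

Support for another standard, such as OAI-PMH, would be a further subclass that overrides records() and leaves the rest of the pipeline unchanged.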
The Etherpad also describes the work they're going to do:
We are going to make 3 modules
- Upload module
MaartenD is going to make a function put(metadata, configuration), with metadata as a dict (Python for associative array), and add a duplicate checker.
- Conversion module / interface module
Make a metadata conversion module that ingests CSVs/OAI-PMH and converts them to the internal dict format; add a GUI that creates the 2 dicts necessary (metadata, configuration).
- Develop a configuration standard as an array of keys for a dict.
draft of standard: configurationTemplate (holds template URL), configurationTitleTemplate
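The notes do not say how the duplicate checker in the upload module will work. One possible approach, sketched here purely as an assumption, is to compare SHA-1 digests of the file contents against those already seen. The DuplicateChecker class, its check() method, and the sample file names are hypothetical, and this version only remembers files within a single run.

```python
import hashlib


def sha1_of(data):
    """Return the hex SHA-1 digest of a bytes object."""
    return hashlib.sha1(data).hexdigest()


class DuplicateChecker:
    """Remembers file digests seen so far in this run."""

    def __init__(self):
        self._seen = {}  # digest -> title of the first file with that content

    def check(self, title, data):
        """Return the title of an earlier identical file, or None if new."""
        digest = sha1_of(data)
        if digest in self._seen:
            return self._seen[digest]
        self._seen[digest] = title
        return None


checker = DuplicateChecker()
first = checker.check("Windmill.jpg", b"image bytes")
second = checker.check("Windmill copy.jpg", b"image bytes")
third = checker.check("Canal.jpg", b"other bytes")
```

Here second is the title of the earlier identical file, while first and third are None because their content had not been seen before. A checker for real uploads would also need to compare against files already on Wikimedia Commons.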
Hope this is a productive weekend and we come out of it with something useful!
Sumana Harihareswara
Wikimedia Foundation
Volunteer Development Coordinator