On Tue, May 12, 2015 at 2:56 PM, James Heald j.heald@ucl.ac.uk wrote:
Hi Brian,
Yes, https://phabricator.wikimedia.org/T98734 was one of the requests; others were https://phabricator.wikimedia.org/T98744 and https://phabricator.wikimedia.org/T98733
These all related to a two-day workshop session organised at the British Library, at significant expense to the BL, to get to grips with GlamWiki Toolset.
The aim on day 1 was to introduce the capabilities of the GWT, Commons, categories etc; and then on day 2 to practically work through some sample image sets.
It was a learning experience for us on the Wikimedia side as well, but one of the very clear messages of the day is that the whole set of processes that need to be gone through before one can get started with the GWT appears to be needlessly convoluted, obstructionist and time-wasting.
GWT is supposed to be the preferred path for GLAMs to upload their content, with maximum metadata capture and reliability.
For a major GLAM like the British Library, it ought to be a complete triviality for the head of their Labs group to get their main domain whitelisted. (And actually, even for the smallest minor GLAM while we're on the subject). The song-and-dance they got put through is utterly pointless and self-defeating.
By the end of the second day of the workshop, their main domain *.bl.uk had still not been whitelisted, because somebody took it upon themselves to quibble about the quality of some of the images in a test set.
The domain acms.sl.nsw.gov.au for a visiting participant from the State Library of New South Wales was only cleared at 4pm, as the workshop was closing -- after quibbling about the metadata.
Luckily the domain www.jacar.go.jp for some Japanese prints was cleared in time, so there was at least one dataset that the tool could actually be used on (rather than what should have been five). Even that was only because some passing admin had over-ruled an initial quibble.
Apparently:
- getting a domain approved requires the main system config file to be changed.
- this can only take effect at 4pm or midnight BST
- requests will only be considered on San Francisco time
- any request will get at least one gratuitous knock back.
This is simply not good enough. When the whole of Flickr is wide-open for use by the GW upload tool, this obstructionism is absurd.
We had five very experienced Commons users in the room today, all of them in good standing with thousands of uploads to their name, and none of them could do anything to move the process forward.
This ought to be reviewed urgently. The GWT whitelist should be divorced from the central system config files as soon as possible and instead be placed somewhere any admin can update it with immediate effect, without any of the runaround on Phabricator. For the GLAM people trying to learn how to do things for themselves at the workshop yesterday and today, that runaround was an utterly confusing and unnecessary extra complexity.
And the next time somebody asks for content upload to be made possible from one of the great libraries in the world, please everyone let's not waste people's time with petty quibbling.
-- James.
While I can appreciate your frustration, a lot of this could easily have been avoided. None of the Phabricator requests noted that there was a deadline by which the request needed to be fulfilled, or stated the purpose of the request. Nor did they mention that you only needed the URL whitelisted on beta labs. What you're calling quibbling looks to me more like people trying to verify that the request was legitimate. (People on projects get really angsty if someone completes a config change request that's not from a legitimate source, and it was unclear from the bug who you were and whether you were legit.)
If the request had been something of the form: "Hi, I'm from the British Library. We are doing a training session between XX:XX-XX:XX UTC on DATE. We plan to upload files from these domains to the beta cluster as part of the training. Please add the domains to $wgCopyUploadsDomains."
it would have been processed much faster.
However, that's not to say I'm blaming you. If anything, these issues reflect a failure in the documentation that tells users how they should proceed and what to expect. It should be clear on http://commons.wikimedia.org/wiki/Commons:GLAMwiki_Toolset that requests are at the very least going to take a business day, and should ideally be submitted a week in advance of any large organized event. It should also say that requests should contain context on what you plan to be doing, so shell users can evaluate the request more easily, and that in the case of a large event, an email to this list a couple of days beforehand, saying what is planned and asking if everything is ready to go, is probably a good idea. (I take responsibility for dragging my feet on responding to comments about the issue that was breaking gwtoolset on beta cluster. However, if I had known that there was an important use of it on beta cluster planned, I would have made sure to deal with that much earlier. On the other hand, those are the breaks of relying on volunteer labour ;)
But even ignoring that, you're right, the current process is off-putting and frustrating. I agree it would be nice if admins could control the whitelist. (I think Chris was opposed to this, see https://phabricator.wikimedia.org/T65961#679911 , but perhaps he could be convinced that the benefits are worth the risk.) Perhaps we should also allow gwtoolset testing on test2.wikipedia.org, which is more stable, in addition to commons beta. (But using commons beta for testing does allow us to test early. Most issues found on commons beta would eventually have gone to production unnoticed if a tester had not run into them there.)
Thanks, Brian