On Tue, May 12, 2015 at 2:56 PM, James Heald <j.heald(a)ucl.ac.uk> wrote:
Hi Brian,
Yes,
https://phabricator.wikimedia.org/T98734 was one of the requests;
others were
https://phabricator.wikimedia.org/T98744 and
https://phabricator.wikimedia.org/T98733
These all related to a two-day workshop session organised at the British
Library, at significant expense to the BL, to get to grips with GlamWiki
Toolset.
The aim on day 1 was to introduce the capabilities of the GWT, Commons,
categories etc; and then on day 2 to practically work through some sample
image sets.
It was a learning experience for us on the Wikimedia side as well, but one
of the very clear messages of the day is that the whole set of processes
that needs to be gone through before one can get started with the GWT
appears to be needlessly convoluted, obstructionist and time-wasting.
GWT is supposed to be the preferred path for GLAMs to upload their content,
with maximum metadata capture and reliability.
For a major GLAM like the British Library, it ought to be a complete
triviality for the head of their Labs group to get their main domain
whitelisted. (And actually, even for the smallest minor GLAM while we're on
the subject). The song-and-dance they got put through is utterly pointless
and self-defeating.
By the end of the second day of the workshop, their main domain
*.bl.uk had still not been whitelisted, because somebody took it upon
themselves to quibble about the quality of some of the images in a test set.
The domain acms.sl.nsw.gov.au for a visiting participant from the State
Library of New South Wales was only cleared at 4pm, as the workshop was
closing -- after quibbling about the metadata.
Luckily the domain www.jacar.go.jp for some Japanese prints was cleared in
time, so there was at least one dataset that the tool could actually be used
on (rather than what should have been five). Even that was only because
some passing admin had over-ruled an initial quibble.
Apparently:
* getting a domain approved requires the main system config file to be
changed.
* this can only take effect at 4pm or midnight BST
* requests will only be considered on San Francisco time
* any request will get at least one gratuitous knock back.
This is simply not good enough. When the whole of Flickr is wide-open for
use by the GW upload tool, this obstructionism is absurd.
We had five very experienced Commons users in the room today, all of them in
good standing with thousands of uploads to their name, and none of them
could do anything to move the process forward.
This ought to be reviewed urgently, and the GWT whitelist should be divorced
from the central system config files as soon as possible, and instead be
placed somewhere where any admin can update it with immediate effect,
without any of the runaround on Phabricator, which for the GLAM people
trying to learn how to do things for themselves at the workshop yesterday
and today was an utterly confusing and unnecessary extra complexity.
And the next time somebody asks for content upload to be made possible from
one of the great libraries in the world, please everyone let's not waste
people's time with petty quibbling.
-- James.
While I can appreciate your frustration, a lot of this could easily
have been avoided. None of the Phabricator requests noted that there
was a deadline by which the request needed to be fulfilled, or stated
the purpose of the request. Nor did they mention that you only needed
the URL whitelisted on beta labs. What you're calling quibbling looks
to me more like people trying to verify that the request was
legitimate. (People on projects get really angsty if someone completes
a config change request that's not from a legitimate source, and it
was unclear from the bug who you were and whether you were legit.)
If the request had been something of the form:
"Hi, I'm from the British Library. We are doing a training session
between XX:XX-XX:XX UTC on DATE. We plan to upload files from these
domains to the beta cluster as part of the training. Please add the
domains to $wgCopyUploadsDomains".
it would have been processed much faster.
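To make concrete what such a whitelist does, here is a minimal sketch in
Python. It is an illustration only, not MediaWiki's actual matching
code: the names ALLOWED_DOMAINS, host_matches and is_upload_allowed are
hypothetical, and the rule that a leading "*." matches any subdomain
(but not the bare domain) is an assumption made for the example.

```python
from urllib.parse import urlparse

# Hypothetical whitelist for illustration only; the entries are the
# domains named in this thread, not the real production configuration.
ALLOWED_DOMAINS = ["*.bl.uk", "acms.sl.nsw.gov.au", "www.jacar.go.jp"]


def host_matches(host: str, pattern: str) -> bool:
    """Return True if host matches pattern.

    Assumption: a leading "*." matches any subdomain of the rest of the
    pattern, but not the bare domain itself. Real software may differ.
    """
    host = host.lower()
    pattern = pattern.lower()
    if pattern.startswith("*."):
        # Compare against ".bl.uk" so that "evilbl.uk" does not match.
        return host.endswith(pattern[1:])
    return host == pattern


def is_upload_allowed(url: str, allowed=ALLOWED_DOMAINS) -> bool:
    """Check whether the URL's host is covered by any whitelist entry."""
    host = urlparse(url).hostname or ""
    return any(host_matches(host, pattern) for pattern in allowed)
```

Under these assumptions, is_upload_allowed("http://www.bl.uk/images/sample.jpg")
is True, while a URL on a domain that is not listed, such as
"https://example.org/a.jpg", is rejected.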
However, that's not to say I'm blaming you. If anything, these issues
reflect a failure in the documentation that tells users how they
should proceed, and what to expect. It should be clear on
http://commons.wikimedia.org/wiki/Commons:GLAMwiki_Toolset that
requests are going to take at least a business day, and should ideally
be submitted a week in advance of any large organized event; that
requests should contain context on what you plan to be doing, so shell
users can evaluate them more easily; and that, in the case of a large
event, an email to this list saying what is planned and asking if
everything is ready to go a couple of days before is probably a good
idea. (I take responsibility for dragging my feet on responding to
comments about the issue that was breaking gwtoolset on beta cluster.
However, if I had known that there was an important use of it on beta
cluster planned, I would have made sure to deal with that much
earlier. On the other hand, those are the breaks of relying on
volunteer labour ;) )
But even ignoring that, you're right, the current process is
off-putting and frustrating. I agree it would be nice if admins could
control the whitelist (I think Chris was opposed to this (
https://phabricator.wikimedia.org/T65961#679911 ), but perhaps he
could be convinced that the benefits are worth the risk). Perhaps we
should also allow gwtoolset testing on test2.wikipedia.org, which is
more stable, in addition to commons beta. (But testing on commons beta
does allow us to test early: most issues found there would eventually
have gone to production unnoticed had a tester not run into them on
commons beta.)
Thanks,
Brian