Magnus' blog post today is a must-read for anyone in the GLAM community who wants to understand why analytics has been such a challenge for us: http://magnusmanske.de/wordpress/?p=173
Quoting the last three paragraphs in full:
[...]
Like others, I have tried to get the Foundation to provide the page view data in a more accessible and local (as in toolserver/Labs) way. Like others, I failed. The last iteration was a video meeting with the Analytics team (newly restarted, as the previous Analytics team didn't really work out for a reason; I didn't inquire too deeply), which ended with a promise to get this done Real Soon Now(tm), and the generous offer to use the page view data from their hadoop cluster. Except the cluster turned out to be empty; I then was encouraged to import the view data myself. (No, this is not a joke. I have the emails to prove it.) As much as I enjoy working with and around the Wikiverse, I do have neither the time, the bandwidth, nor the inclination to do your paid jobs for you, thank you very much.
As the sophisticated reader might have picked up at this point, the entire topic is rather frustrating for myself and others, and being unable to offer a patchy, error-prone data set to GLAMs who have released hundreds of thousands of files under a free license into Commons is, quite frankly, disgraceful. The requirement for the Foundation is not unreasonable; providing what Henrik has been doing for years on his own would be quite sufficient. Not even that is required; myself and others have volunteered to write interfaces if the back-end data is provided in a usable form.
Of the tools I try to provide in the GLAM realm, some don't really work at the moment due to the constraints described above; some work so-so, kept running with a significant amount of manual fixing. Adding 100.000 Wellcome Trust images may be enough for them to come to a grinding halt. And when all the institutions who so graciously have contributed free content to the Wikiverse come a-running, I will make it perfectly clear that there is only the Foundation to blame.
And as many of us have said before, thanks to Magnus for doing as much as he has been able to within current circumstances.
Dominic
Thanks for sharing this, Dominic. It would be interesting to chat about this sometime. I’ve run into the same issue in the past when relying on stats.grok.se for Linkypedia. It became untenable, so I stopped showing article view stats.
I wonder if it would be worthwhile to get together (Skype, hangout?) to chat about the current state of the analytics hadoop cluster, and how we might collectively push the Foundation in the right direction? In my few conversations with them, they seemed focused on building out general tools rather than specific (and very useful) tools like stats.grok.se. But we should be able to right that ship, no?
I know that the Europeana project are looking to collect statistics as part of their Wiki GLAM Toolset project [1,2]. It might be good to rope some of them into the conversation if Magnus isn’t already connected with them.
//Ed
[1] http://pro.europeana.eu/web/guest/pro-blog/-/blogs/europeana-wiki-glam-tools...
[2] https://commons.wikimedia.org/wiki/Commons:GLAMToolset_project
On Feb 17, 2014, at 7:22 PM, Dominic McDevitt-Parks mcdevitd@gmail.com wrote:
Magnus' blog post today is a must-read for anyone in the GLAM community who wants to understand why analytics has been such a challenge for us: http://magnusmanske.de/wordpress/?p=173
Quoting the last three paragraphs in full:
[...]
Like others, I have tried to get the Foundation to provide the page view data in a more accessible and local (as in toolserver/Labs) way. Like others, I failed. The last iteration was a video meeting with the Analytics team (newly restarted, as the previous Analytics team didn’t really work out for a reason; I didn’t inquire too deeply), which ended with a promise to get this done Real Soon Now™, and the generous offer to use the page view data from their hadoop cluster. Except the cluster turned out to be empty; I then was encouraged to import the view data myself. (No, this is not a joke. I have the emails to prove it.) As much as I enjoy working with and around the Wikiverse, I do have neither the time, the bandwidth, nor the inclination to do your paid jobs for you, thank you very much.
As the sophisticated reader might have picked up at this point, the entire topic is rather frustrating for myself and others, and being unable to offer a patchy, error-prone data set to GLAMs who have released hundreds of thousands of files under a free license into Commons is, quite frankly, disgraceful. The requirement for the Foundation is not unreasonable; providing what Henrik has been doing for years on his own would be quite sufficient. Not even that is required; myself and others have volunteered to write interfaces if the back-end data is provided in a usable form.
Of the tools I try to provide in the GLAM realm, some don’t really work at the moment due to the constraints described above; some work so-so, kept running with a significant amount of manual fixing. Adding 100.000 Wellcome Trust images may be enough for them to come to a grinding halt. And when all the institutions who so graciously have contributed free content to the Wikiverse come a-running, I will make it perfectly clear that there is only the Foundation to blame.
And as many of us have said before, thanks to Magnus for doing as much as he has been able to within current circumstances.
Dominic