This is quite a timely topic, as I was just about to send out a message about our Australian plans …

 

As part of our 2013 Annual Plan, Wikimedia Australia has put aside some funds towards hosting these dumps in Australia. We are already in conversation with one of the large research high-performance computing facilities and some other partners about setting this up. Funds permitting, our long-term vision is not only to host the dumps in raw form but also to process them in various ways, both to make them easier for researchers to work with and to make it easier for us at Wikimedia Australia to extract KPIs and build an evidence base on the impact our initiatives have on Australian editors and Australian content. There should also be high-performance support for researchers wishing to do complex analysis of the data.

 

To that end, I would like to create a mailing list of Australian researchers who either have used or wish to use these data dumps. This is partly because I need to show that there is a community of researchers to justify the initial resource allocation, partly because I need input on the ways we might pre-process the dumps to make them most accessible for research purposes, and finally because we might need to work together on some funding applications (e.g. ARC LIEF, NECTAR) to realise the full vision.

 

So, if you are a researcher in Australia, please let me know who you are and what you have done or hope to do with Wikipedia data. Feel free to circulate this to interested colleagues.

 

Thanks

 

Kerry

 

 


From: wiki-research-l-bounces@lists.wikimedia.org [mailto:wiki-research-l-bounces@lists.wikimedia.org] On Behalf Of Maria Miteva
Sent: Monday, 25 February 2013 11:35 PM
To: wiki-research-l@lists.wikimedia.org
Cc: Ariel Glenn WMF
Subject: [Wiki-research-l] Looking for mirrors for Data dumps

 

 Hi everyone, 

 

As you can see at the top of https://meta.wikimedia.org/wiki/Data_dumps, WMF is actively looking for help archiving and distributing data dumps. It would be great if you could check with the institutions you are associated with whether they have available storage and bandwidth to donate. This would make it easier to keep better dump archives and improve download speed. You can see more about the requirements here. Ariel Glenn (in CC) will be happy to help anyone willing to host a mirror.

 

Thank you. 

 

Mariya