Hi everyone,
As you can see at the top of https://meta.wikimedia.org/wiki/Data_dumps, WMF is actively looking for help archiving and distributing data dumps. It would be great if you could check whether the institutions you are associated with have storage and bandwidth available to donate. That would make it easier to keep better dump archives and improve download speed. You can read more about the requirements here: http://meta.wikimedia.org/wiki/Mirroring_Wikimedia_project_XML_dumps#Requirements. Ariel Glenn (in CC) will be happy to help anyone willing to host a mirror.
Thank you.
Mariya
What's wrong with using archive.org?
Wiki-research-l mailing list Wiki-research-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wiki-research-l
This is quite a timely topic, as I had just been about to send out a message about our Australian plans …
As part of our 2013 Annual Plan, Wikimedia Australia has put aside some funds towards hosting these dumps in Australia. We are already in conversation with one of the large research high-performance computing facilities and some other partners about setting this up. Funds permitting, our long-term vision is not only to host the dumps in raw form but also to process them in various ways, both to make them easier for researchers to work with and to make us at Wikimedia Australia better able to extract KPIs and build an evidence base on the impact our initiatives have on Australian editors and Australian content. There should also be high-performance support for those researchers wishing to do complex analysis of the data.
To that end, I would like to create a mailing list of Australian researchers who either have used or wish to use these data dumps. This is partly because I need to show that there is a community of researchers to justify the initial resource allocation, partly because I need input on the ways we might pre-process the dumps to make them most accessible for research purposes, and finally because we might need to work together on some funding applications (e.g. ARC LIEF, NECTAR, etc.) to realise the full vision.
So, if you are a researcher in Australia, please let me know who you are and what you have done or hope to do with Wikipedia data. Feel free to circulate this to interested colleagues.
Thanks
Kerry
_____
From: wiki-research-l-bounces@lists.wikimedia.org [mailto:wiki-research-l-bounces@lists.wikimedia.org] On Behalf Of Maria Miteva
Sent: Monday, 25 February 2013 11:35 PM
To: wiki-research-l@lists.wikimedia.org
Cc: Ariel Glenn WMF
Subject: [Wiki-research-l] Looking for mirrors for Data dumps
Hi everyone,
As you can see at the top of https://meta.wikimedia.org/wiki/Data_dumps, WMF is actively looking for help archiving and distributing data dumps. It would be great if you could check whether the institutions you are associated with have storage and bandwidth available to donate. That would make it easier to keep better dump archives and improve download speed. You can read more about the requirements here: http://meta.wikimedia.org/wiki/Mirroring_Wikimedia_project_XML_dumps#Requirements. Ariel Glenn (in CC) will be happy to help anyone willing to host a mirror.
Thank you.
Mariya
Hi,
Anthony, what do you mean what's wrong with archive.org?
All the current mirrors (http://meta.wikimedia.org/wiki/Mirroring_Wikimedia_project_XML_dumps#Current_Mirrors) are located in the Americas and Europe. It would be great for users around the world to have geographically closer mirrors, which are likely to offer better bandwidth. I am really happy to hear that Wikimedia Australia is working on hosting the dumps. Get in touch with Ariel when you need help with that.
Mariya
On Mon, Feb 25, 2013 at 10:39 PM, Kerry Raymond kerry.raymond@gmail.com wrote:
This is quite a timely topic, as I had just been about to send out a message about our Australian plans …
As part of our 2013 Annual Plan, Wikimedia Australia has put aside some funds towards hosting these dumps in Australia. We are already in conversation with one of the large research high-performance computing facilities and some other partners about setting this up. Funds permitting, our long-term vision is not only to host the dumps in raw form but also to process them in various ways, both to make them easier for researchers to work with and to make us at Wikimedia Australia better able to extract KPIs and build an evidence base on the impact our initiatives have on Australian editors and Australian content. There should also be high-performance support for those researchers wishing to do complex analysis of the data.
To that end, I would like to create a mailing list of Australian researchers who either have used or wish to use these data dumps. This is partly because I need to show that there is a community of researchers to justify the initial resource allocation, partly because I need input on the ways we might pre-process the dumps to make them most accessible for research purposes, and finally because we might need to work together on some funding applications (e.g. ARC LIEF, NECTAR, etc.) to realise the full vision.
So, if you are a researcher in Australia, please let me know who you are and what you have done or hope to do with Wikipedia data. Feel free to circulate this to interested colleagues.
Thanks
Kerry
On Wed, Feb 27, 2013 at 8:14 AM, Mariya Nedelcheva Miteva mariya.miteva@gmail.com wrote:
Anthony, what do you mean what's wrong with archive.org?
Why aren't the dumps being uploaded to archive.org?
(Maybe the answer is that they are, and I just didn't know about it, though.)
Yes, they are (or were). I am unofficially in charge of that, though I stopped the uploading due to issues with Labs. I am currently archiving only special files and the daily incremental dumps.
Everything is available in the "wikimediadownloads" collection [1].
[1]: http://archive.org/details/wikimediadownloads
On Thu, Feb 28, 2013 at 8:26 AM, Anthony wikimail@inbox.org wrote:
Why aren't the dumps being uploaded to archive.org?
(Maybe the answer is that they are, and I just didn't know about it, though.)