Xmldatadumps-l November 2010

xmldatadumps-l@lists.wikimedia.org

8 participants
10 discussions

Re: [Xmldatadumps-l] Mirroring Wikimedia project XML dumps
by emijrp 16 Nov '10

16 Nov '10

Thanks, Paolo. I hadn't added archive.org because it requires us to upload the files, and the files cannot be deleted in the future, so, upload the latest dumps (updated every month or so) could be a waste of resources for them. I was thinking about mirrors who runs wget to slurp all the files from download.wikimedia.org, and the next month delete the previous ones. Of course, we can contact to archive.org to ask them about the wget idea. Regards, emijrp 2010/11/16 paolo massa <paolo(a)gnuband.org> > I've added archive.org ;) > > On Tue, Nov 16, 2010 at 12:05 PM, emijrp <emijrp(a)gmail.com> wrote: > > Hi all; > > > > I have started a new page in meta: for coordinating the efforts in > mirroring > > Wikimedia project XML dumps. I asked some days ago to iBiblio if they > were > > interested in this, but they replied: "Unfortunately, we do not have the > > resources to provide a mirror of wikipedia. Best of luck!" > > > > I think that we must work on this, so, all the help is welcome. If you > know > > about universities, archives, etc, that could be interested in get a copy > of > > the XML files, for backup or research purposes, please, add them to the > list > > and we can send them a letter. > > > > We are compiling all the human knowledge! That deserves being mirroring > ad > > nauseam! > > > > Regards, > > emijrp > > > > [1] > > > https://secure.wikimedia.org/wikipedia/meta/wiki/Mirroring_Wikimedia_projec… > > > > _______________________________________________ > > Xmldatadumps-l mailing list > > Xmldatadumps-l(a)lists.wikimedia.org > > https://lists.wikimedia.org/mailman/listinfo/xmldatadumps-l > > > > > > > > -- > -- > Paolo Massa > Email: paolo AT gnuband DOT org > Blog: http://gnuband.org >

2 1

Mirroring Wikimedia project XML dumps
by emijrp 16 Nov '10

16 Nov '10

Hi all; I have started a new page in meta: for coordinating the efforts in mirroring Wikimedia project XML dumps. I asked some days ago to iBiblio if they were interested in this, but they replied: *"Unfortunately, we do not have the resources to provide a mirror of wikipedia. Best of luck!" *I think that we must work on this, so, all the help is welcome. If you know about universities, archives, etc, that could be interested in get a copy of the XML files, for backup or research purposes, please, add them to the list and we can send them a letter. We are compiling all the human knowledge! That deserves being mirroring ad nauseam! Regards, emijrp [1] https://secure.wikimedia.org/wikipedia/meta/wiki/Mirroring_Wikimedia_projec…

1 0

Re: [Xmldatadumps-l] Temporary dump mirror locations?
by Diederik van Liere 15 Nov '10

15 Nov '10

I was thinking of the last 3 - 4 month, thus July - October 2010. On 2010-11-15, at 10:30 AM, emijrp wrote: > Define recent. > > 2010/11/12 Diederik van Liere <dvanliere(a)gmail.com> > Hi, > > As we all know, download.wikimedia.org is temporarily offline. Does > somebody have a recent stub-meta-history.xml available (any language > is okay)? > > Best regards, > > > Diederik > > _______________________________________________ > Xmldatadumps-l mailing list > Xmldatadumps-l(a)lists.wikimedia.org > https://lists.wikimedia.org/mailman/listinfo/xmldatadumps-l >

3 5

Temporary dump mirror locations?
by Diederik van Liere 12 Nov '10

12 Nov '10

Hi, As we all know, download.wikimedia.org is temporarily offline. Does somebody have a recent stub-meta-history.xml available (any language is okay)? Best regards, Diederik

1 0

How to use datadumps
by xiang wang 11 Nov '10

11 Nov '10

Hello all: I'm going to construct an ontology database with wikipedia. What I want to do is importing datadumps into a database and then extract knowledge from the databse. But I find a problem .As you know that Chinese contains Simplified Chinese and Traditional Chinese. When I check the data in the dumps, I find both Simplified Chinese and Traditional Chinese mixes together. I don't know how to convert Traditional Chinese to Simplified Chinese. Is that possible I use the datadumps to construct my ontology database? The datadumps I download is "zhwiki-20101014". Thanks! David

1 0

XML dumps stopped, possible fs/disk issues on dump server under investigation
by Ariel T. Glenn 10 Nov '10

10 Nov '10

We noticed a kernel panic message and stack trace in the logs on the server that servers XML dumps. The web server that provides access to these files is temporarily out of commission; we hope to have it back on line in 12 hours or less. Dumps themselves have been suspended while we investigate. I hope to have an update on this tomorrow as well. Ariel

2 2

nlwiktionary dump progress on 20101106 very slow
by Andreas Meier 06 Nov '10

06 Nov '10

Hello, see http://download.wikipedia.org/nlwiktionary/20101106/ Best regards Andreas

1 0

How to use the "Database backup dumps"
by xiang wang 05 Nov '10

05 Nov '10

I have download the "Database backup dumps" of chinese Edition. There are files with XML and sql format. I want to have data all in database like MySQL. Can I get this data (especially the XML format) to MySQL database without using MediaWiki? How to do this if possible? Where can I get the format details of each dump? Because I have read contents in "zhwiki-20101014-pages-articles.xml" , but chiness have two eddition: "Simplified Chinese" and “Traditional Chinese”. Both format exits raffertily In file "zhwiki-20101014-pages-articles.xml" . I don't known how to get rid it. Thanks!

2 3

Saludos
by jcms 05 Nov '10

05 Nov '10

Y Mucho amor -- Este mensaje le ha llegado mediante el servicio de correo electronico que ofrece Infomed para respaldar el cumplimiento de las misiones del Sistem a Nacional de Salud. La persona que envia este correo asume el compromiso de usar el servicio a tales fines y cumplir con las regulaciones establecidas Infomed: http://www.sld.cu/

1 0

Statistics about downloads of dumps
by emijrp 01 Nov '10

01 Nov '10

Hi all; Are there statistics about how many people download the dumps? Not only the hits, also the completed downloads (is it possible?), if not, the wasted bandwidth would be a good measure. Regards, emijrp

1 0

2024

2023

2022

2021

2020

2019

2018

2017

2016

2015

2014

2013

2012

2011

2010

2009

Xmldatadumps-l November 2010