Hi all, I was just time travelling at http://dumps.wikimedia.org/enwiki/ and the oldest dump I could find was: 20110526.
I am building an evaluation dataset that needs the state of Wikipedia in 2010. Is there a way I could get my hands on a 2010 version of pages-articles.xml.bz2?
Thank you, Pablo
PS: for anybody interested in navigating Wikipedia's history nicely: https://bugzilla.wikimedia.org/show_bug.cgi?id=34778
Why don't you use pages-meta-history? It should contain all the information you want, no matter when dumps were made.
Petr Onderka [[en:User:Svick]]
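A minimal sketch of how the pages-meta-history suggestion could be applied, assuming the usual MediaWiki export layout (`page` > `title`, `revision` > `timestamp` / `text`) and a cutoff of the end of 2010. The function name `snapshot_at` and the inline sample data are made up for illustration and are not part of any Wikimedia tool. For each page it keeps the newest revision whose timestamp is at or before the cutoff; pages created later are skipped.

```python
import io
import xml.etree.ElementTree as ET


def local(tag):
    """Strip the XML namespace, e.g. '{ns}page' -> 'page'."""
    return tag.rsplit("}", 1)[-1]


def snapshot_at(xml_stream, cutoff):
    """Yield (title, text) for each page's latest revision at or before cutoff.

    cutoff is an ISO 8601 UTC string such as '2010-12-31T23:59:59Z'.
    Timestamps in that fixed format compare correctly as plain strings.
    """
    title, best_ts, best_text = None, None, None
    for _event, elem in ET.iterparse(xml_stream, events=("end",)):
        name = local(elem.tag)
        if name == "title":
            title = elem.text
        elif name == "revision":
            ts, text = None, ""
            for child in elem:
                child_name = local(child.tag)
                if child_name == "timestamp":
                    ts = child.text
                elif child_name == "text":
                    text = child.text or ""
            # Keep this revision if it is within the cutoff and newer than the best so far.
            if ts is not None and ts <= cutoff and (best_ts is None or ts > best_ts):
                best_ts, best_text = ts, text
            elem.clear()  # drop revision content already processed
        elif name == "page":
            if best_ts is not None:
                yield title, best_text
            title, best_ts, best_text = None, None, None
            elem.clear()


# Hypothetical sample data in the shape of a history dump (not real content).
SAMPLE = """<mediawiki xmlns="http://www.mediawiki.org/xml/export-0.6/">
  <page>
    <title>Example</title>
    <revision><timestamp>2009-03-01T00:00:00Z</timestamp><text>old</text></revision>
    <revision><timestamp>2010-11-15T08:30:00Z</timestamp><text>as of 2010</text></revision>
    <revision><timestamp>2011-06-01T00:00:00Z</timestamp><text>too new</text></revision>
  </page>
  <page>
    <title>Created later</title>
    <revision><timestamp>2012-01-01T00:00:00Z</timestamp><text>new page</text></revision>
  </page>
</mediawiki>"""

if __name__ == "__main__":
    stream = io.BytesIO(SAMPLE.encode("utf-8"))
    for page_title, page_text in snapshot_at(stream, "2010-12-31T23:59:59Z"):
        print(page_title, "->", page_text)
```

For a real dump the stream would presumably be the decompressed history file (for a .bz2 file, something like `bz2.open(path, "rb")`). Note that this only approximates a 2010 pages-articles dump: pages deleted since then are not in current history dumps, so they would be missing from the reconstruction.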
In reply to the message from Pablo Mendes pablomendes@gmail.com of Mon, Jun 4, 2012 at 7:15 PM.
Xmldatadumps-l mailing list Xmldatadumps-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/xmldatadumps-l
http://dumps.wikimedia.org/archive/ If you really need historical dumps there are a few there.
Ariel
In reply to the message from Petr Onderka of Mon, Jun 4, 2012 at 19:23 +0200.
Out of curiosity, when is a dump considered historical and moved to archive/?
Thanks, Diane
In reply to the message from Ariel T. Glenn sent Monday, June 04, 2012 1:44 PM. Subject: Re: [Xmldatadumps-l] Dump from 2010?
It's ad hoc right now, but basically we keep about a year's worth of everything, maybe a little less, as "current". That window will likely shrink as dumps get larger, but only gradually, over a good period of time.
Ariel
In reply to the message from Diane Napolitano of Mon, Jun 4, 2012 at 11:02 -0700.