Dear all,
I'm looking for an old Wikipedia dump (August 8th, 2008). Any ideas where I can get it?
Thanks in advance! Nicolai
2012/4/27 Nicolai Erbs erbs@ukp.informatik.tu-darmstadt.de
Dear all,
I'm looking for an old Wikipedia dump (August 8th, 2008). Any ideas where I can get it?
Which language?
English, please.
________________________________
From: emijrp [emijrp@gmail.com]
Sent: Friday, 27 April 2012 16:11
To: Nicolai Erbs
Cc: xmldatadumps-l@lists.wikimedia.org
Subject: Re: [Xmldatadumps-l] Old dump for Wikipedia (August 8th, 2008)
2012/4/27 Nicolai Erbs <erbs@ukp.informatik.tu-darmstadt.de>
Dear all,
I'm looking for an old Wikipedia dump (August 8th, 2008). Any ideas where I can get it?
Which language?
--
Emilio J. Rodríguez-Posada. E-mail: emijrp AT gmail DOT com
Pre-doctoral student at the University of Cádiz (Spain)
Projects: AVBOT http://code.google.com/p/avbot/ | StatMediaWiki http://statmediawiki.forja.rediris.es | WikiEvidens http://code.google.com/p/wikievidens/ | WikiPapers http://wikipapers.referata.com | WikiTeam http://code.google.com/p/wikiteam/
Personal website: https://sites.google.com/site/emijrp/
2012/4/27 Nicolai Erbs erbs@ukp.informatik.tu-darmstadt.de
English, please.
Here is an English Wikipedia dump from 20080103: http://dumps.wikimedia.org/archive/ However, I remember that some old dumps were corrupted.
Ariel, is that dump OK?
Hi,
Why do you need a dump from 2008? You can use a recent dump and only analyze the data up to 20080103.
Best,
Diederik
Xmldatadumps-l mailing list Xmldatadumps-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/xmldatadumps-l
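Diederik's suggestion (use a recent dump and look only at data up to a cutoff date) could be sketched roughly as below. This is only an illustration under some assumptions: the recent dump is a full-history XML export that contains every revision of each page, and the function name `snapshot_at` and the sample data are made up for this example. For each page it keeps the newest revision at or before the cutoff.

```python
import io
import xml.etree.ElementTree as ET


def local(tag):
    """Strip the XML namespace, e.g. '{ns}page' -> 'page'."""
    return tag.rsplit("}", 1)[-1]


def snapshot_at(xml_file, cutoff):
    """Yield (title, timestamp, text) of each page's newest revision <= cutoff.

    cutoff is a UTC timestamp string such as '2008-08-08T23:59:59Z';
    timestamps in this fixed format sort correctly as plain strings.
    """
    title, best = None, None
    for _event, elem in ET.iterparse(xml_file, events=("end",)):
        tag = local(elem.tag)
        if tag == "title":
            title = elem.text
        elif tag == "revision":
            ts, text = None, ""
            for child in elem:
                if local(child.tag) == "timestamp":
                    ts = child.text
                elif local(child.tag) == "text":
                    text = child.text or ""
            if ts is not None and ts <= cutoff and (best is None or ts > best[0]):
                best = (ts, text)
            elem.clear()  # free memory; history dumps are very large
        elif tag == "page":
            if best is not None:
                yield title, best[0], best[1]
            title, best = None, None
            elem.clear()


# Tiny hand-written sample in the shape of a MediaWiki XML export (hypothetical data).
SAMPLE = """<mediawiki xmlns="http://www.mediawiki.org/xml/export-0.6/">
  <page>
    <title>Darmstadt</title>
    <revision><timestamp>2007-05-01T10:00:00Z</timestamp><text>old text</text></revision>
    <revision><timestamp>2008-08-01T09:30:00Z</timestamp><text>summer 2008 text</text></revision>
    <revision><timestamp>2011-02-11T18:45:00Z</timestamp><text>recent text</text></revision>
  </page>
  <page>
    <title>Created later</title>
    <revision><timestamp>2010-01-01T00:00:00Z</timestamp><text>new page</text></revision>
  </page>
</mediawiki>"""

pages = list(snapshot_at(io.BytesIO(SAMPLE.encode("utf-8")), "2008-08-08T23:59:59Z"))
```

Pages that have no revision before the cutoff are skipped. On a real dump the file would be opened through a decompressor and streamed rather than loaded from a string.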
Thanks for your answers so far!
I would like to compare the contexts of links in two versions of Wikipedia for the purpose of named entity disambiguation (one is a current version and the other should be from August 2008).
It might be possible to reconstruct such a version, but this could be time-consuming. Additionally, wouldn't I miss the articles that have been deleted in the meantime?
Best,
Nicolai
Cool!! But if a page exists in August 2008 and not in the current dump, how are you going to compare the two pages?
- Diane
Good question, but I don't need to do that. I just need to extract all possible contexts for every link to a specific target. I don't compare different revisions of an article directly; it's merely a comparison of what knowledge can be extracted from the links in different versions.
Best,
Nicolai
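To make "extract all possible contexts for every link to a specific target" concrete, here is a rough sketch that works on the raw wikitext of one article. It is not Nicolai's actual pipeline: the function names, the character window, and the title normalization (underscores to spaces, first letter upper-cased) are assumptions for illustration, and the regular expression ignores redirects, templates, and nested links.

```python
import re

# [[Target]], [[Target|anchor]] or [[Target#Section|anchor]]
LINK = re.compile(r"\[\[([^\[\]|#]+)(?:#[^\[\]|]*)?(?:\|([^\[\]]*))?\]\]")


def normalize(title):
    """Normalize a link target: underscores to spaces, first letter upper-case."""
    title = title.strip().replace("_", " ")
    return title[:1].upper() + title[1:]


def link_contexts(wikitext, target, window=30):
    """Return (anchor, left, right) for every [[link]] pointing at target.

    left and right are up to `window` characters of raw wikitext on each
    side of the link.
    """
    wanted = normalize(target)
    found = []
    for m in LINK.finditer(wikitext):
        if normalize(m.group(1)) != wanted:
            continue
        anchor = m.group(2) if m.group(2) else m.group(1).strip()
        left = wikitext[max(0, m.start() - window):m.start()]
        right = wikitext[m.end():m.end() + window]
        found.append((anchor, left, right))
    return found


text = ("He moved to [[Paris, Texas|Paris]] in 1990. "
        "Later he visited [[Paris]] and [[paris, Texas]] again.")
contexts = link_contexts(text, "Paris, Texas")
anchors = [anchor for anchor, _left, _right in contexts]
```

Running the same function over the article texts of two dumps gives, per target, the set of contexts available in each version, which is the kind of comparison described above.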
On Fri, 27-04-2012 at 16:16 +0200, emijrp wrote:
2012/4/27 Nicolai Erbs erbs@ukp.informatik.tu-darmstadt.de English, please.
Here is an English Wikipedia dump from 20080103: http://dumps.wikimedia.org/archive/ However, I remember that some old dumps were corrupted.
Ariel, is that dump OK?
As far as I know it's ok.
Ariel