Good question, but I don't need to do that. I just need to extract all possible contexts for every link to a specific target. I don't compare different revisions of an article directly; I only compare which knowledge can be extracted from the links in the different versions.
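A minimal sketch of what extracting link contexts could look like, assuming the input is raw wikitext of a single page. The names `link_contexts` and `normalize_title`, the regular expression, and the 50-character window are illustrative assumptions, not the actual pipeline; templates, redirects, and nested markup are not handled.

```python
import re

# Matches [[Target]] and [[Target|anchor text]]; a section part (#...) is ignored.
WIKILINK = re.compile(r"\[\[([^\[\]|#]+)(?:#[^\[\]|]*)?(?:\|([^\[\]]*))?\]\]")


def normalize_title(title):
    """Approximate MediaWiki title normalization: trim, underscores to spaces,
    upper-case the first letter (an assumption that holds for most wikis)."""
    title = title.strip().replace("_", " ")
    return title[:1].upper() + title[1:]


def link_contexts(wikitext, target, window=50):
    """Return (left context, anchor text, right context) for every link to `target`."""
    wanted = normalize_title(target)
    contexts = []
    for match in WIKILINK.finditer(wikitext):
        if normalize_title(match.group(1)) != wanted:
            continue
        left = wikitext[max(0, match.start() - window):match.start()]
        right = wikitext[match.end():match.end() + window]
        # Fall back to the link target when no explicit anchor text is given.
        anchor = (match.group(2) or match.group(1)).strip()
        contexts.append((left, anchor, right))
    return contexts


sample = ("He moved to [[Paris|the French capital]] in 1920. "
          "Later, [[paris]] hosted the games. See [[Paris, Texas]].")
for left, anchor, right in link_contexts(sample, "Paris", window=10):
    print(repr(left), repr(anchor), repr(right))
```

Running the same function over the pages of both dumps would give two sets of contexts per target, so a page that exists in only one dump simply contributes contexts to only one set.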
Best,
Nicolai
________________________________
From: Napolitano, Diane [dnapolitano@ets.org]
Sent: Friday, 27 April 2012 17:15
To: Nicolai Erbs
Cc: xmldatadumps-l@lists.wikimedia.org
Subject: RE: [Xmldatadumps-l] Old dump for Wikipedia (August 8th, 2008)
Cool!! But if a page exists in August 2008 and not in the current dump, how are you going to compare the two pages?
- Diane
From: xmldatadumps-l-bounces@lists.wikimedia.org [mailto:xmldatadumps-l-bounces@lists.wikimedia.org] On Behalf Of Nicolai Erbs
Sent: Friday, April 27, 2012 10:29 AM
To: Diederik van Liere; emijrp
Cc: xmldatadumps-l@lists.wikimedia.org
Subject: Re: [Xmldatadumps-l] Old dump for Wikipedia (August 8th, 2008)
Thanks for your answers so far!
I would like to compare contexts of links in two versions of Wikipedia for the purpose of named entity disambiguation (one is a current version and the other one should be from August, 2008).
It might be possible to reconstruct a version but this could be time-consuming. Additionally, wouldn't I miss those articles that have been deleted in the meantime?
Best,
Nicolai
________________________________
From: Diederik van Liere [dvanliere@gmail.com]
Sent: Friday, 27 April 2012 16:19
To: emijrp
Cc: Nicolai Erbs; xmldatadumps-l@lists.wikimedia.org
Subject: Re: [Xmldatadumps-l] Old dump for Wikipedia (August 8th, 2008)

Hi,
Why do you need a dump from 2008? You can use a recent dump and only analyze the data up to 20080103.
Best,
Diederik

On Fri, Apr 27, 2012 at 10:16 AM, emijrp <emijrp@gmail.com> wrote:
2012/4/27 Nicolai Erbs <erbs@ukp.informatik.tu-darmstadt.de>
English, please.
Here is an English Wikipedia dump from 20080103: http://dumps.wikimedia.org/archive/
But I remember that some old dumps were corrupted.
Ariel, is that dump OK?
--
Emilio J. Rodríguez-Posada. E-mail: emijrp AT gmail DOT com
Pre-doctoral student at the University of Cádiz (Spain)
Projects: AVBOT (http://code.google.com/p/avbot/) | StatMediaWiki (http://statmediawiki.forja.rediris.es) | WikiEvidens (http://code.google.com/p/wikievidens/) | WikiPapers (http://wikipapers.referata.com) | WikiTeam (http://code.google.com/p/wikiteam/)
Personal website: https://sites.google.com/site/emijrp/
_______________________________________________
Xmldatadumps-l mailing list
Xmldatadumps-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/xmldatadumps-l