Good question, but I don't need to do that.
I just need to extract all possible contexts for every link to a specific target. I
don't compare different revisions of an article directly; I only compare which
knowledge can be extracted from the links in the different versions.
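
A minimal sketch of what extracting link contexts could look like, assuming raw wikitext as input and a fixed character window around each link as its context. The name `link_contexts`, the window size, and the regular expression are illustrative assumptions, not the pipeline discussed in this thread; templates, nested links, and redirects are not handled.

```python
import re
from collections import defaultdict

# Matches [[Target]], [[Target|anchor]] and [[Target#Section|anchor]].
# The section suffix is dropped so that all links to one page share a key.
WIKILINK = re.compile(r"\[\[([^\[\]|#]+)(?:#[^\[\]|]*)?(?:\|([^\[\]]*))?\]\]")


def link_contexts(wikitext, window=40):
    """Map each link target to the text windows surrounding its links.

    `window` is the number of characters kept on each side of the link
    (an arbitrary choice for this sketch).
    """
    contexts = defaultdict(list)
    for match in WIKILINK.finditer(wikitext):
        target = match.group(1).strip()
        start = max(0, match.start() - window)
        end = min(len(wikitext), match.end() + window)
        contexts[target].append(wikitext[start:end])
    return dict(contexts)
```

Run once per Wikipedia version, this yields two target-to-contexts mappings that can be compared per target without ever aligning individual article revisions.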
Best,
Nicolai
________________________________
From: Napolitano, Diane [dnapolitano(a)ets.org]
Sent: Friday, April 27, 2012 5:15 PM
To: Nicolai Erbs
Cc: xmldatadumps-l(a)lists.wikimedia.org
Subject: RE: [Xmldatadumps-l] Old dump for Wikipedia (August 8th, 2008)
Cool!! But if a page exists in August 2008 and not in the current dump, how are you going
to compare the two pages?
- Diane
From: xmldatadumps-l-bounces(a)lists.wikimedia.org
[mailto:xmldatadumps-l-bounces@lists.wikimedia.org] On Behalf Of Nicolai Erbs
Sent: Friday, April 27, 2012 10:29 AM
To: Diederik van Liere; emijrp
Cc: xmldatadumps-l(a)lists.wikimedia.org
Subject: Re: [Xmldatadumps-l] Old dump for Wikipedia (August 8th, 2008)
Thanks for your answers so far!
I would like to compare contexts of links in two versions of Wikipedia for the purpose of
named entity disambiguation (one is a current version and the other one should be from
August, 2008).
It might be possible to reconstruct a version but this could be time-consuming.
Additionally, wouldn't I miss those articles that have been deleted in the meantime?
Best,
Nicolai
________________________________
From: Diederik van Liere [dvanliere(a)gmail.com]
Sent: Friday, April 27, 2012 4:19 PM
To: emijrp
Cc: Nicolai Erbs; xmldatadumps-l(a)lists.wikimedia.org
Subject: Re: [Xmldatadumps-l] Old dump for Wikipedia (August 8th, 2008)
Hi,
Why do you need a dump from 2008? You can use a recent dump and only analyze the data up
to 20080103.
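
A rough sketch of the "analyze only the data up to a date" idea, assuming a full-history dump whose revisions have already been parsed into `(timestamp, text)` pairs with timestamps in the `2008-01-03T12:00:00Z` form. The function name `revision_as_of` and the input shape are assumptions made for illustration. Pages deleted since the cutoff would presumably be missing from a recent dump, so this cannot recover them.

```python
from datetime import datetime

# Timestamp layout assumed for <timestamp> values in the XML dump.
TS_FORMAT = "%Y-%m-%dT%H:%M:%SZ"


def revision_as_of(revisions, cutoff):
    """Return the text of the latest revision at or before `cutoff`.

    `revisions` is an iterable of (timestamp_string, text) pairs for one
    page, in any order. Returns None if the page had no revision yet.
    """
    best = None
    for timestamp, text in revisions:
        when = datetime.strptime(timestamp, TS_FORMAT)
        if when <= cutoff and (best is None or when > best[0]):
            best = (when, text)
    return best[1] if best else None
```

Calling this for every page with the same cutoff approximates the state of the wiki on that day, at the cost of reading the whole revision history.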
Best,
Diederik
On Fri, Apr 27, 2012 at 10:16 AM, emijrp <emijrp@gmail.com> wrote:
2012/4/27 Nicolai Erbs <erbs@ukp.informatik.tu-darmstadt.de>
English, please.
Here you have one English Wikipedia dump from 20080103
http://dumps.wikimedia.org/archive/
But I remember some old dumps were corrupted.
Ariel, is that dump OK?
--
Emilio J. Rodríguez-Posada. E-mail: emijrp AT gmail DOT com
Pre-doctoral student at the University of Cádiz (Spain)
Projects:
AVBOT<http://code.google.com/p/avbot/> |
StatMediaWiki<http://statmediawiki.forja.rediris.es> |
WikiEvidens<http://code.google.com/p/wikievidens/> |
WikiPapers<http://wikipapers.referata.com> |
WikiTeam<http://code.google.com/p/wikiteam/>
Personal website:
https://sites.google.com/site/emijrp/
_______________________________________________
Xmldatadumps-l mailing list
Xmldatadumps-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/xmldatadumps-l