Indeed, that will be an issue for everyone who consumes Wikipedia data automatically,
especially as more structured data (e.g. infoboxes) eventually moves from MediaWiki to
Wikidata. DBpedia will have the same issue at some point.
Nicolas.
--
Nicolas Torzec
Yahoo! Labs.
From: François Bonzon
<francois.bonzon@gmail.com>
Date: Monday, March 4, 2013 7:35 AM
To: xmldatadumps-l@lists.wikimedia.org
Subject: [Xmldatadumps-l] Wikidata project and interwiki links removed in wiki text
Hi,
I understand from
http://www.wikidata.org/wiki/Wikidata:News that
- enwiki since February 13, 2013
- hewiki and itwiki since January 30, 2013
- huwiki since January 14, 2013
have migrated to the Wikidata project. And more wikis will follow shortly.
One consequence is that wiki markup for interwiki links (cross-language links) is being
gradually removed from articles, because the MediaWiki software can now read them from the
centralized Wikidata repository.
I verified in the latest huwiki dump that some articles indeed no longer have interwiki
links. Can you confirm the statements above?
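A check of this kind can be sketched in Python. This is only an illustration, assuming interwiki links appear in the page text as [[lang:Title]] with a lowercase language prefix; the names LANG_CODES and find_interwiki_links are hypothetical, and the language list is a small subset chosen for the example, not the full set a real check would need.

```python
import re

# Assumption: a small illustrative subset of language prefixes. A real check
# would use the full list of language codes known to the wiki being examined.
LANG_CODES = {"en", "de", "fr", "he", "hu", "it"}

# Matches [[prefix:Title]] with a lowercase prefix, no leading colon, no pipe.
INTERWIKI_RE = re.compile(r"\[\[([a-z-]+):([^\[\]|]+)\]\]")


def find_interwiki_links(wikitext):
    """Return (language, title) pairs for interwiki links found in wikitext."""
    return [
        (m.group(1), m.group(2).strip())
        for m in INTERWIKI_RE.finditer(wikitext)
        if m.group(1) in LANG_CODES
    ]


sample = (
    "Budapest is the capital of Hungary.\n"
    "[[Kategória:Fővárosok]]\n"
    "[[en:Budapest]]\n"
    "[[de:Budapest]]\n"
)
print(find_interwiki_links(sample))
```

An article whose text yields an empty list here has no interwiki markup left under these assumptions. Inline links such as [[:en:Budapest]] are deliberately not counted, since the leading colon makes them ordinary links rather than cross-language links.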
How can I now extract interwiki links from dumps? Is there a separate Wikidata dump I
should download? What attributes should I look for to join the Wikidata dump with the
individual language wiki dumps? Thanks for your help.
-François