Indeed, that will be an issue for everyone who consumes Wikipedia data automatically, especially as more structured data (e.g. infoboxes) eventually moves from MediaWiki to Wikidata. DBpedia will have the same issue at some point.

Nicolas.

--
Nicolas Torzec
Yahoo! Labs.

From: François Bonzon <francois.bonzon@gmail.com>
Date: Monday, March 4, 2013 7:35 AM
To: "xmldatadumps-l@lists.wikimedia.org" <xmldatadumps-l@lists.wikimedia.org>
Subject: [Xmldatadumps-l] Wikidata project and interwiki links removed in wiki text

Hi,

I understand from http://www.wikidata.org/wiki/Wikidata:News that
- enwiki since February 13, 2013
- hewiki and itwiki since January 30, 2013
- huwiki since January 14, 2013
have migrated to the Wikidata project, and more wikis will follow shortly.

One consequence is that the wiki markup for interwiki links (cross-language links) is gradually being removed from articles, because the MediaWiki software can now read these links from the centralized Wikidata repository.

I verified in the latest huwiki dump that some articles indeed no longer have interwiki links. Can you confirm the above?
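
For reference, a check of this kind can be sketched roughly as below. This is only a minimal sketch, assuming interwiki links appear in the page text in the usual [[lang:Title]] form; the function name find_interwiki_links and the small SAMPLE_LANGS set are made up for illustration, and a real run would need the full list of language codes.

```python
import re

# Matches [[prefix:Title]] where the prefix is lowercase, e.g. [[de:Budapest]].
# Inline links such as [[:fr:Budapest]] start with a colon and are not matched.
LINK_RE = re.compile(r"\[\[([a-z][a-z-]*):([^\[\]|]+)\]\]")

# Hypothetical sample of language codes; not a complete list.
SAMPLE_LANGS = {"de", "en", "fr", "he", "hu", "it"}


def find_interwiki_links(wikitext, langs=SAMPLE_LANGS):
    """Return (language, title) pairs for interwiki links found in wikitext."""
    links = []
    for match in LINK_RE.finditer(wikitext):
        prefix, title = match.group(1), match.group(2).strip()
        # Keep only prefixes that are known language codes.
        if prefix in langs:
            links.append((prefix, title))
    return links


if __name__ == "__main__":
    sample = "Budapest is a city.\n[[Category:Cities]]\n[[de:Budapest]]\n[[en:Budapest]]"
    print(find_interwiki_links(sample))
```

A page whose text yields an empty list here would be one that no longer carries its interwiki links in the markup.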

How can I now extract interwiki links from the dumps? Is there a separate Wikidata dump I should download? What attributes should I look for to join the Wikidata dump with the individual language wiki dumps? Thanks for your help.

-François