Hi,
I understand from http://www.wikidata.org/wiki/Wikidata:News that the following wikis have migrated to the Wikidata project:
- enwiki, since February 13, 2013
- hewiki and itwiki, since January 30, 2013
- huwiki, since January 14, 2013
More wikis will follow shortly.
One consequence is that wiki markup for interwiki links (cross-language links) is gradually being removed from articles, because the MediaWiki software can now read them from the centralized Wikidata repository.
I verified in the latest huwiki dump that some articles indeed no longer have interwiki links. Can you confirm the statements above?
How can I now extract interwiki links from dumps? Is there a separate Wikidata dump I should download? What attributes should I look for to join Wikidata and the individual language wiki dumps? Thanks for your help.
-François
François Bonzon, 04/03/2013 16:35:
How can I now extract interwiki links from dumps? Is there a separate Wikidata dump I should download? What attributes should I look for to join Wikidata and the individual language wiki dumps? Thanks for your help.
http://dumps.wikimedia.org/huwiki/20130224/huwiki-20130224-langlinks.sql.gz
https://www.mediawiki.org/wiki/Manual:Langlinks_table
Nemo
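For readers who want to use the langlinks file without loading it into MySQL, here is a minimal Python sketch. It assumes the file is a standard mysqldump whose INSERT statements carry the three columns described in Manual:Langlinks_table (ll_from, ll_lang, ll_title), and that the text is UTF-8. The function names are made up for illustration, and the unescaping only covers backslash-escaped quotes and backslashes.

```python
import gzip
import re
from typing import Iterator, Tuple

# One (ll_from, ll_lang, ll_title) tuple inside an INSERT statement.
# Assumption: strings are single-quoted with backslash escapes.
ROW_RE = re.compile(r"\((\d+),'((?:[^'\\]|\\.)*)','((?:[^'\\]|\\.)*)'\)")
ESCAPE_RE = re.compile(r"\\(.)")


def unescape(value: str) -> str:
    """Drop the backslash in front of escaped quotes and backslashes."""
    return ESCAPE_RE.sub(r"\1", value)


def parse_insert_line(line: str) -> Iterator[Tuple[int, str, str]]:
    """Yield (ll_from, ll_lang, ll_title) for every row on one dump line."""
    if not line.startswith("INSERT INTO"):
        return  # skip schema definitions, comments and blank lines
    for match in ROW_RE.finditer(line):
        yield (int(match.group(1)),
               unescape(match.group(2)),
               unescape(match.group(3)))


def read_langlinks(path: str) -> Iterator[Tuple[int, str, str]]:
    """Stream rows from a <language>wiki-<date>-langlinks.sql.gz file."""
    with gzip.open(path, "rt", encoding="utf-8", errors="replace") as dump:
        for line in dump:
            yield from parse_insert_line(line)


if __name__ == "__main__":
    # Hand-written sample line, not taken from a real dump.
    sample = "INSERT INTO `langlinks` VALUES (12,'en','Budapest'),(12,'fr','L\\'Aquila');"
    for row in parse_insert_line(sample):
        print(row)
```

Streaming line by line keeps memory use flat even for the larger wikis; for anything beyond a quick extraction, importing the dump into MySQL is probably the more robust route.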
Thanks Nemo.
I confirm I now see interwiki language links originating from Wikidata in the <language>wiki-<date>-langlinks.sql.gz dumps, in the format described in the second link you sent. However, this is a MySQL dump, not an XML dump.
So language links are no longer available in the XML data dumps?
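On the earlier question of how to join the two kinds of dump: per Manual:Langlinks_table, ll_from holds the page_id of the linking page, which corresponds to the <id> element directly under <page> in the XML dumps. A rough Python sketch of that join follows, assuming (ll_from, ll_lang, ll_title) rows have already been read from the langlinks table. The function names are invented for this example.

```python
import io
import xml.etree.ElementTree as ET
from typing import Dict, IO, Iterable, Iterator, Tuple


def local(tag: str) -> str:
    """Strip the '{namespace}' prefix ElementTree puts on tag names."""
    return tag.rsplit("}", 1)[-1]


def page_titles(xml_file: IO[bytes]) -> Dict[int, str]:
    """Map page id -> page title from a pages XML dump stream."""
    titles: Dict[int, str] = {}
    for _event, elem in ET.iterparse(xml_file, events=("end",)):
        if local(elem.tag) != "page":
            continue
        title = page_id = None
        # Only direct children: the <id> under <revision> is a revision id.
        for child in elem:
            name = local(child.tag)
            if name == "title":
                title = child.text
            elif name == "id":
                page_id = int(child.text)
        if page_id is not None and title is not None:
            titles[page_id] = title
        elem.clear()  # release the page's text once it has been read
    return titles


def join_langlinks(
    titles: Dict[int, str],
    rows: Iterable[Tuple[int, str, str]],
) -> Iterator[Tuple[str, str, str]]:
    """Yield (local title, target language, target title) per langlinks row."""
    for ll_from, ll_lang, ll_title in rows:
        if ll_from in titles:  # skip pages absent from this XML dump
            yield titles[ll_from], ll_lang, ll_title


if __name__ == "__main__":
    # Tiny hand-written stand-in for a real dump.
    xml = (b"<mediawiki><page><title>Budapest</title><id>12</id>"
           b"<revision><id>999</id></revision></page></mediawiki>")
    titles = page_titles(io.BytesIO(xml))
    for triple in join_langlinks(titles, [(12, "en", "Budapest")]):
        print(triple)
```

The title dictionary is held in memory, which should be fine for a wiki the size of huwiki but may need a disk-backed store for enwiki.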
On Mon, Mar 4, 2013 at 4:45 PM, Federico Leva (Nemo) <nemowiki@gmail.com> wrote:
François Bonzon, 04/03/2013 16:35:
How can I now extract interwiki links from dumps? Is there a separate Wikidata dump I should download? What attributes should I look for to join Wikidata and the individual language wiki dumps? Thanks for your help.
http://dumps.wikimedia.org/huwiki/20130224/huwiki-20130224-langlinks.sql.gz
https://www.mediawiki.org/wiki/Manual:Langlinks_table
Nemo
Indeed, that will be an issue for everyone who consumes Wikipedia data automatically, especially as more structured data (e.g. infoboxes) eventually moves from MediaWiki to Wikidata. DBpedia will have the same issue at some point.
Nicolas.
-- Nicolas Torzec Yahoo! Labs.
From: François Bonzon <francois.bonzon@gmail.com>
Date: Monday, March 4, 2013 7:35 AM
To: "xmldatadumps-l@lists.wikimedia.org" <xmldatadumps-l@lists.wikimedia.org>
Subject: [Xmldatadumps-l] Wikidata project and interwiki links removed in wiki text