Hey Marieke,

You can use either the Wikidata Toolkit by Markus Krötzsch, if you want to work on the dump, or the Wikidata web API, if you only need a few such mappings at a time.
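
In case it helps, here is a minimal Python sketch of the web API route. It assumes the `wbgetentities` action with `props=sitelinks`, which returns the per-wiki page titles for an item. The function names and the User-Agent string are made up for illustration, so treat this as a starting point rather than a finished tool.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen

API_URL = "https://www.wikidata.org/w/api.php"


def build_sitelinks_url(item_id):
    """Build a wbgetentities request URL that asks only for sitelinks."""
    params = {
        "action": "wbgetentities",
        "ids": item_id,
        "props": "sitelinks",
        "format": "json",
    }
    return API_URL + "?" + urlencode(params)


def extract_sitelinks(response, item_id):
    """Map site ids (e.g. 'enwiki') to page titles from a parsed response.

    Returns an empty dict if the item or its sitelinks are missing.
    """
    entity = response.get("entities", {}).get(item_id, {})
    sitelinks = entity.get("sitelinks", {})
    return {site: link["title"] for site, link in sitelinks.items()}


def fetch_sitelinks(item_id):
    """Fetch and parse the sitelinks for one item (needs network access)."""
    # Hypothetical User-Agent; replace it with one that identifies your tool.
    request = Request(
        build_sitelinks_url(item_id),
        headers={"User-Agent": "sitelinks-example/0.1"},
    )
    with urlopen(request) as resp:
        return extract_sitelinks(json.load(resp), item_id)
```

Calling `fetch_sitelinks("Q213710")` should give you a dict from site id to page title for the item you mentioned. For more than a handful of items the dump is likely the better choice.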

On Jul 17, 2014 9:24 AM, "Erp, M.G.J. van" <marieke.van.erp@vu.nl> wrote:
Hi there,

I was wondering how to get the language mappings between different Wikipedia pages. This information seems to be available on Wikidata, as I can find it by browsing pages such as http://www.wikidata.org/wiki/Q213710, and https://www.mediawiki.org/wiki/Manual:Langlinks_table mentions a langlinks table, but I can't figure out how to get a dump.

The "Wiki interlanguage link records" at http://dumps.wikimedia.org/wikidatawiki/20140705/ looked promising, but that seems to contain user information, if I'm not mistaken. For example, "select count(*), ll_title from langlinks group by 2 order by 1 desc limit 20;" results in:

+----------+--------------------------------------+
| count(*) | ll_title                             |
+----------+--------------------------------------+
|      284 | User:تفکر                            |
|      272 | user:OffsBlink                       |
|      215 | User:YourEyesOnly                    |
|      179 | User:MoiraMoira                      |
|       65 | User:AvocatoBot                      |
|       35 | User:Shikai shaw                     |
|       35 | user:Shuaib-bot                      |
|       33 | user:לערי ריינהארט                   |
|       33 | User:Leyo                            |
|       27 | user:Лобачев Владимир                |
|       20 | User:Wagino 20100516                 |
|       18 | user:Gangleri                        |
|       17 | user:I18n                            |
|       16 | user:Meursault2004                   |
|       12 | User:Labant                          |
|       11 | User:Stryn                           |
|       11 | User:angelia2041                     |
|       10 | user:Kelvin                          |
|       10 | User:JCIV                            |
|        9 | Template:Mbox                        |
+----------+--------------------------------------+

I checked the #mediawiki IRC channel, where someone recommended the "Interwiki link tracking records", but those also seem to contain all sorts of other links, and I don't see a way to filter out the "in other languages" links. It would be great if you could help me out.

Thanks!

Marieke van Erp



--
Computational Lexicology & Terminology Lab (CLTL)
The Network Institute, VU University Amsterdam

De Boelelaan 1105
1081 HV  Amsterdam, The Netherlands
http://www.mariekevanerp.com
http://www.newsreader-project.eu



_______________________________________________
Wikidata-l mailing list
Wikidata-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata-l