Hi, is there a file somewhere with a list of all namespace names and numbers for all the Wikimedia wikis? The list should at least have the canonical names, but preferably also any aliases. This is useful for finding out which namespaces interwiki links and language links go to.
The canonical names are in the siteinfo section of the XML dumps of each wiki, but not the aliases - and it is not practical to download complete dumps for all projects just to get namespace names.
Regards, - Byrial
http://en.wikipedia.org/w/api.php?action=query&meta=siteinfo&siprop=...
should be what you need.
On Tue, Jul 2, 2013 at 3:11 PM, Byrial Jensen byrial@vip.cybercity.dk wrote:
Hi, is there a file somewhere with a list of all namespace names and numbers for all the Wikimedia wikis? The list should at least have the canonical names, but preferably also any aliases. This is useful for finding out which namespaces interwiki links and language links go to.
The canonical names are in the siteinfo section of the XML dumps of each wiki, but not the aliases - and it is not practical to download complete dumps for all projects just to get namespace names.
Regards,
- Byrial
Xmldatadumps-l mailing list Xmldatadumps-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/xmldatadumps-l
At 02-07-2013 21:49, John wrote:
http://en.wikipedia.org/w/api.php?action=query&meta=siteinfo&siprop=...
should be what you need.
Thank you. If I change to a machine-readable output format such as xml or json and repeat the query for all 286 Wikipedias (first; later I may also need other projects), it will do, with some work. It is good that I have curl to help here ...
Regards, - Byrial
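For readers who want to try the approach described above, here is a minimal Python sketch of building the siteinfo query and reading a JSON reply. The URL quoted in the thread is truncated, so the `siprop` value used here (`namespaces|namespacealiases`) is an assumption based on the property names mentioned in this thread. The function names, the sample data, and the legacy JSON layout (namespace name under the `*` key) are also assumptions for illustration, not taken from anyone's actual program.

```python
import json
from urllib.parse import urlencode


def siteinfo_url(host):
    """Build a siteinfo query URL for one wiki host, e.g. 'en.wikipedia.org'."""
    params = {
        "action": "query",
        "meta": "siteinfo",
        # Assumed property list; the URL quoted in the thread is truncated.
        "siprop": "namespaces|namespacealiases",
        "format": "json",
    }
    return "http://%s/w/api.php?%s" % (host, urlencode(params))


def parse_namespaces(text):
    """Map namespace number -> local name, canonical name and aliases."""
    query = json.loads(text)["query"]
    result = {}
    for ns in query["namespaces"].values():
        result[ns["id"]] = {
            "name": ns["*"],
            # The main namespace (0) has no canonical name.
            "canonical": ns.get("canonical", ""),
            "aliases": [],
        }
    for alias in query.get("namespacealiases", []):
        entry = result.setdefault(
            alias["id"], {"name": "", "canonical": "", "aliases": []}
        )
        entry["aliases"].append(alias["*"])
    return result


# Hand-written sample in the assumed legacy JSON layout, not real API output.
SAMPLE = """{"query": {"namespaces": {
  "0": {"id": 0, "case": "first-letter", "*": ""},
  "4": {"id": 4, "case": "first-letter", "*": "Wikipedia", "canonical": "Project"}},
  "namespacealiases": [{"id": 4, "*": "WP"}]}}"""

if __name__ == "__main__":
    print(siteinfo_url("en.wikipedia.org"))
    print(parse_namespaces(SAMPLE))
```

Fetching the URL for each wiki (with curl or any HTTP client) and passing the body to `parse_namespaces` would then give one table per wiki.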
DBpedia has a file [1] which contains all the data you need, albeit wrapped in Scala code. This file is generated by another Scala class [2] that calls the Wikipedia API URLs that John pointed out. You could modify that class to produce any format you like. If you don't know Scala, you might be better off writing the code from scratch though.
JC
[1] https://github.com/dbpedia/extraction-framework/blob/dump/core/src/main/scal... [2] https://github.com/dbpedia/extraction-framework/blob/dump/core/src/main/scal...
On 3 July 2013 00:18, Byrial Jensen byrial@vip.cybercity.dk wrote:
Thank you. If I change to a machine-readable output format such as xml or json and repeat the query for all 286 Wikipedias (first; later I may also need other projects), it will do, with some work. It is good that I have curl to help here ...
Regards,
- Byrial
At 08-07-2013 02:42, Jona Christopher Sahnwaldt wrote:
DBpedia has a file [1] which contains all the data you need, albeit wrapped in Scala code. This file is generated by another Scala class [2] that calls the Wikipedia API URLs that John pointed out. You could modify that class to produce any format you like. If you don't know Scala, you might be better off writing the code from scratch though.
Thank you, but I have already got the needed data by writing a small program to first get a list of Wikipedias from http://noc.wikimedia.org/conf/wikipedia.dblist and then query each of them. My program (in C with help from CURL, available at http://toolserver.org/~byrial/wikidata-programs/get_namespaces.c) is not elegant, but it does the job.
Regards, - Byrial
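The first step described above, turning the database list into a set of wikis to query, could look roughly like the following Python sketch. It is not the C program mentioned in the message. It assumes the dblist is plain text with one database name per line (such as `enwiki`), and that a host name can be derived by dropping the project suffix and replacing underscores with hyphens. That mapping is an assumption and may not hold for every wiki. The function name `hosts_from_dblist` is hypothetical.

```python
def hosts_from_dblist(text, suffix="wiki", domain="wikipedia.org"):
    """Turn dblist text (one database name per line) into host names.

    Assumed mapping: 'zh_min_nanwiki' -> 'zh-min-nan.wikipedia.org'.
    Lines that are empty, start with '#', or lack the suffix are skipped.
    """
    hosts = []
    for line in text.splitlines():
        db = line.strip()
        if not db or db.startswith("#") or not db.endswith(suffix):
            continue
        lang = db[: -len(suffix)].replace("_", "-")
        hosts.append("%s.%s" % (lang, domain))
    return hosts


if __name__ == "__main__":
    # Hand-written sample lines, not the real dblist contents.
    print(hosts_from_dblist("enwiki\nzh_min_nanwiki\n"))
```

Each resulting host can then be queried in turn. Other projects would need a different `suffix` and `domain`, for example `suffix="wikivoyage", domain="wikivoyage.org"`.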
On 8 July 2013 03:00, Byrial Jensen byrial@vip.cybercity.dk wrote:
Thank you, but I have already got the needed data by writing a small program to first get a list of Wikipedias from http://noc.wikimedia.org/conf/wikipedia.dblist and then query each of them. My program (in C with help from CURL, available at http://toolserver.org/~byrial/wikidata-programs/get_namespaces.c) is not elegant, but it does the job.
Looks good.
You initially said that you also want the aliases. You can get them if you add "|namespacealiases" to the URL, as John pointed out, and parse their XML elements. But I guess you already know that. :-)
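As an illustration of parsing the alias elements, here is a small Python sketch using the standard library's ElementTree. It assumes the XML reply wraps the aliases in a `namespacealiases` element containing `ns` elements with an `id` attribute and the alias as text. The sample document is hand-written to that assumed shape and is not real API output, and the function name is hypothetical.

```python
import xml.etree.ElementTree as ET


def aliases_from_xml(text):
    """Return {namespace number: [alias, ...]} from a siteinfo XML reply."""
    root = ET.fromstring(text)
    aliases = {}
    # Only look inside <namespacealiases>; <namespaces> also contains <ns>.
    for block in root.iter("namespacealiases"):
        for el in block.findall("ns"):
            aliases.setdefault(int(el.get("id")), []).append(el.text or "")
    return aliases


# Hand-written sample in the assumed layout, not real API output.
SAMPLE_XML = """<api><query>
<namespaces>
  <ns id="0" case="first-letter" />
  <ns id="4" case="first-letter" canonical="Project">Wikipedia</ns>
</namespaces>
<namespacealiases>
  <ns id="4">WP</ns>
  <ns id="5">WT</ns>
</namespacealiases>
</query></api>"""

if __name__ == "__main__":
    print(aliases_from_xml(SAMPLE_XML))
```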
At 08-07-2013 15:20, Jona Christopher Sahnwaldt wrote:
Looks good.
You initially said that you also want the aliases. You can get them if you add "|namespacealiases" to the URL, as John pointed out, and parse their XML elements. But I guess you already know that. :-)
Yes, I knew that, but thank you anyway. The current version of my program gets both namespace names and aliases for all Wikipedias and Wikivoyages. I added Wikivoyage because its language links are now being migrated to Wikidata, and I use this data to analyse the Wikidata database. If anyone needs namespace names for other Wikimedia projects, the program is easy to adapt.
Regards, - Byrial
xmldatadumps-l@lists.wikimedia.org