Hi. Currently, the dump service offers two different dumps for Wikidata:
* XML: http://dumps.wikimedia.org/wikidatawiki/latest/
* JSON: http://dumps.wikimedia.org/wikidatawiki/entities/
According to http://www.wikidata.org/wiki/Wikidata:Database_download, the JSON dump is the recommended dump format. Also, at the time of writing, the JSON dump has been generated regularly every week, whereas the XML dump has been delayed for 2+ months.
Going forward, will both dumps continue to be supported? Or will the XML dump be phased out and only the JSON dump remain? Or are these plans still to be determined based on upcoming changes to the dumping infrastructure as per https://phabricator.wikimedia.org/T88728?
If the JSON dump is to be the sole data format, is there any way to address the following omissions?
* '''Non-JSON pages not available''': The JSON dump only provides JSON content-type pages in the main and property namespaces. Pages in other namespaces are not available, including the Main Page. For example, here are the page counts per namespace from the 2015-03-30 dump:
id    name          count
----  ------------  -----
4     Wikidata      10280
8     MediaWiki      2244
10    Template       4701
12    Help            779
14    Category       3073
828   Module          175
1198  Translations  83524
* '''Page metadata not available''': For the JSON pages, page_touched and page_id are not available.
* '''Other tables not provided''': Other tables, notably categorylinks and page_props, are not provided.
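To make the page metadata point easy to check, here is a minimal Python sketch that counts which top-level keys appear on the entities in a JSON dump. It assumes the dump is a single JSON array with one entity per line, each line ending in a comma. The function name count_top_level_keys and the sample lines are made up for illustration and are not taken from a real dump.

```python
import json
from collections import Counter


def count_top_level_keys(lines):
    """Count how often each top-level key appears across entities.

    Assumption: the dump is one JSON array with a single entity per line,
    each entity line ending in a comma (except possibly the last).
    """
    counts = Counter()
    for line in lines:
        line = line.strip().rstrip(",")
        if line in ("", "[", "]"):
            continue  # skip blank lines and the enclosing array brackets
        entity = json.loads(line)
        counts.update(entity.keys())
    return counts


# Hypothetical sample lines, NOT taken from a real dump.
sample = [
    "[",
    '{"id": "Q1", "type": "item", "labels": {}},',
    '{"id": "P1", "type": "property", "labels": {}}',
    "]",
]
counts = count_top_level_keys(sample)
for key in ("page_id", "page_touched"):
    print(key, "present" if key in counts else "absent")
```

Against a real dump, the same function could be fed a file object opened with gzip.open(path, "rt", encoding="utf-8"), where path is whatever local filename the dump was saved under. Because the file is read line by line, the whole dump never has to fit in memory.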
Thanks in advance for any information.