> If you are also using the same software (Wikibase on MediaWiki), the XML dumps
> should Just Work (tm). The idea of the XML dumps is that the "text" blobs are
> opaque to 3rd parties, but will continue to work with future versions of
> MediaWiki & friends (with a compatible configuration - which is rather tricky).
Not sure I follow. Even from a Wikibase on MediaWiki perspective, the
XML dumps are still incomplete (since they're missing mainsnak.datatype).
For example, consider the following:
* You download only the pages-articles.xml dump from
https://dumps.wikimedia.org/wikidatawiki/latest/
* You load it into MediaWiki
* You then create a module that looks like the Wikidata Module from
Russian Wikipedia:
https://ru.wikipedia.org/w/index.php?title=Module:Wikidata&action=edit
One line of the module specifically checks for datatype: "if datatype
and datatype == 'commonsMedia' then". Because the XML dump does not have
"mainsnak.datatype", this check always evaluates to false, even when you
are looking at an entity (Q38: Italy) and a property (P41: flag image)
whose datatype is "commonsMedia".
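To make the failure concrete, here is a rough Python rendering of that check (the real module is Lua). The snak dictionaries and the file name value are illustrative assumptions, and renders_as_image is a hypothetical helper, not something from the module:

```python
# Illustrative snak as it might appear in the XML dump: no "datatype" key.
# The file name value is an assumption made up for this sketch.
snak_from_xml_dump = {
    "snaktype": "value",
    "property": "P41",
    "datavalue": {"value": "Flag of Italy.svg", "type": "string"},
}

# The same snak as it might appear in the JSON dump, with "datatype" present.
snak_from_json_dump = dict(snak_from_xml_dump, datatype="commonsMedia")


def renders_as_image(snak):
    """Mirror of: if datatype and datatype == 'commonsMedia' then ..."""
    datatype = snak.get("datatype")
    return bool(datatype and datatype == "commonsMedia")


print(renders_as_image(snak_from_xml_dump))
print(renders_as_image(snak_from_json_dump))
```

With the XML-style snak the check is false, so the module falls through to its text branch; with the JSON-style snak it is true.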
From a user standpoint, this means that if you're trying to set up a
local version of Russian Wikipedia and Wikidata, then none of the Country
infoboxes will show the country's flag (the above line of code
will substitute text for the image).
The only way around this is to supplement the XML dump with the JSON
dump. But then, you'll need to download two large dumps and somehow
merge them. (I don't know if MediaWiki has a facility to load the JSON
dump, much less merge it.)
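For what it's worth, the merge could be sketched outside MediaWiki roughly as follows. This is only a sketch under assumptions: it assumes the JSON dump has one entity per line inside a surrounding "[" ... "]", that property entities carry a top-level "datatype", and that the entity JSON has already been extracted from the XML "text" blob. Both function names are hypothetical, and it only touches mainsnaks (not qualifiers or references):

```python
import json


def build_datatype_map(json_dump_lines):
    """Collect property id -> datatype from the lines of the JSON dump."""
    datatypes = {}
    for line in json_dump_lines:
        line = line.strip().rstrip(",")
        if line in ("", "[", "]"):
            continue  # skip the array brackets and blank lines
        entity = json.loads(line)
        if entity.get("type") == "property" and "datatype" in entity:
            datatypes[entity["id"]] = entity["datatype"]
    return datatypes


def add_missing_datatypes(entity, datatypes):
    """Fill in mainsnak.datatype on an entity taken from the XML dump."""
    claims = entity.get("claims")
    if not isinstance(claims, dict):
        return entity  # no claims, or an unexpected shape: leave untouched
    for property_id, statements in claims.items():
        datatype = datatypes.get(property_id)
        if datatype is None:
            continue  # unknown property: nothing to add
        for statement in statements:
            statement.get("mainsnak", {}).setdefault("datatype", datatype)
    return entity
```

Usage would be along the lines of building the map once from the JSON dump, then calling add_missing_datatypes on each entity decoded from the XML dump before importing it. Whether the patched JSON can then be loaded back into MediaWiki is a separate question that I can't answer.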
Anyway, I understand that there are technical complications with
trying to add mainsnak.datatype to the XML dumps. But if this never
gets resolved, then the current situation basically offers two
unsatisfying options:
* Have an XML dump which is 99.9% complete but still missing key info
(mainsnak.datatype)
* Try to merge the JSON dump into the XML dump (which MediaWiki may
not be able to do)
Hope this makes sense.
Thanks.
On Sun, Nov 27, 2016 at 11:49 AM, Daniel Kinzler
<daniel.kinzler(a)wikimedia.de> wrote:
> On 27.11.2016 at 01:15, gnosygnu wrote:
>> This is useful, but unfortunately it won't suffice. Wikidata also has
>> pages which are wikitext (for example,
>> https://www.wikidata.org/wiki/Wikidata:WikiProject_Names). These
>> wikitext pages are in the XML dumps, but aren't in the stub dumps nor
>> the JSON dumps. I actually do use these Wikidata wikitext entries to
>> try to reproduce Wikidata in its entirety.
>
> If you are also using the same software (Wikibase on MediaWiki), the XML dumps
> should Just Work (tm). The idea of the XML dumps is that the "text" blobs are
> opaque to 3rd parties, but will continue to work with future versions of
> MediaWiki & friends (with a compatible configuration - which is rather tricky).
>
>
> --
> Daniel Kinzler
> Senior Software Developer
>
> Wikimedia Deutschland
> Gesellschaft zur Förderung Freien Wissens e.V.
>
> _______________________________________________
> Wikidata mailing list
> Wikidata(a)lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/wikidata