Hi,
I noticed that the response from "http://www.wikidata.org/w/api.php?action=query&titles=Q1&prop=revisi..." changed from "entity":"q1" to "entity":["item",1]. Does this change apply to all pages?
In the latest Wikidata dump (http://dumps.wikimedia.org/wikidatawiki/latest/wikidatawiki-latest-pages-met...), both formats exist at the same time. For example, page Q100 has "entity":["item",100], while page Q100000 has "entity":"q100000". Is this expected? Will the next dump have the same format? By the way, "http://www.wikidata.org/w/api.php?action=query&titles=Q100000&prop=r..." returns "entity":["item",100000].
Thanks.
Hi Anthony,
That's the internal data structure, and it is bound to change without notice. I am sorry if this caused trouble.
If this is a common concern, we will start documenting and announcing those changes. It should really only concern people processing the XML dumps.
We would actually prefer to create a more stable output dump of the knowledge; I guess this would be more appreciated (like the RDF dump that Markus posted about recently).
The call to get the item description should have been
https://www.wikidata.org/w/api.php?action=wbgetentities&format=json&ids=Q1
This should provide you with a more stable answer.
Cheers, Denny
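A minimal sketch of building the wbgetentities request recommended above, using only the Python standard library. The endpoint and the action, format, and ids parameters are taken from the URL in this message; the function names, the User-Agent string, and the timeout are hypothetical. Joining several IDs with "|" follows the general MediaWiki API convention for multi-value parameters.

```python
import json
import urllib.parse
import urllib.request

API_ENDPOINT = "https://www.wikidata.org/w/api.php"


def wbgetentities_url(ids):
    """Build a wbgetentities URL for one or more entity IDs such as 'Q1'."""
    query = {
        "action": "wbgetentities",
        "format": "json",
        # Multi-value parameters are pipe-separated in the MediaWiki API.
        "ids": "|".join(ids),
    }
    # Keep '|' unescaped so the URL stays readable.
    return API_ENDPOINT + "?" + urllib.parse.urlencode(query, safe="|")


def fetch_entities(ids, user_agent="example-dump-reader/0.1"):
    """Fetch and decode the JSON response (performs a network request)."""
    request = urllib.request.Request(
        wbgetentities_url(ids), headers={"User-Agent": user_agent}
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)
```

Calling wbgetentities_url(["Q1"]) reproduces the URL given above; fetch_entities is only needed when the response itself is wanted.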
2013/8/1 Huidong Zhang anthonyzhang@google.com
-- Best wishes, Anthony Zhang (Huidong Zhang)
Wikidata-l mailing list Wikidata-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikidata-l
About the inconsistency in the dump file, has a bug entry been created for this? (I can create one, if anyone can point me to the proper place to do that.)
On 10-08-2013 10:54, Jiang BIAN wrote:
On Wed, Aug 7, 2013 at 10:11 PM, Denny Vrandečić <denny.vrandecic@wikimedia.de> wrote:
Hi Anthony, that's the internal data structure, and this is bound to change without notice. I am sorry if this caused trouble. If this is a common concern, we will start documenting and announcing those changes. It really should only concern the people processing the XML dumps.
I am one of the people processing the XML dumps, and I don't think it is a big deal. But I have had to change my parser many times to be able to parse new dumps because of changes in the format (in most cases, but not always, because of new features).
I just adapt to the changes without fuss, but if the format were documented, I could file bug reports whenever the format deviates from the documentation, which might be helpful to the developers.
(BTW, the time values seem to be OK again, after many syntax errors in the beginning. But the coordinate values have some strange (probably erroneous?) variations: values where the precision and/or globe is given as "null", and values where the globe is given as the string "earth" instead of an entity.)
About the inconsistency in the dump file, has a bug entry been created for this? (I can create one, if anyone can point me to the proper place to do that.)
Not for my sake. I adapted to two entity formats in the dumps immediately when the new format started to appear.
Best regards, - Byrial
On 10/08/13 10:29, Byrial Jensen wrote: ...
(BTW, the time values seem to be OK again, after many syntax errors in the beginning. But the coordinate values have some strange (probably erroneous?) variations: values where the precision and/or globe is given as "null", and values where the globe is given as the string "earth" instead of an entity.)
Thanks for the warning. This has been causing problems in the RDF dump too. I am now validating the globe settings more carefully.
Cheers,
Markus
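The coordinate anomalies described above (a null precision or globe, and the literal string "earth" in place of an entity) could be flagged with a check along these lines. This is only a sketch and not the validation actually used for the RDF dump: the dict layout with "precision" and "globe" keys, the function name, and the pattern for what counts as an entity reference (a Q-number, optionally with a wikidata.org entity URI prefix) are all assumptions.

```python
import re

# Assumed shape of an entity reference: "Q2" or ".../entity/Q2".
_ENTITY_REF = re.compile(r"^(?:https?://www\.wikidata\.org/entity/)?Q\d+$")


def coordinate_problems(value):
    """Return a list of human-readable problems found in a coordinate value."""
    problems = []
    if value.get("precision") is None:
        problems.append("precision is null")
    globe = value.get("globe")
    if globe is None:
        problems.append("globe is null")
    elif not (isinstance(globe, str) and _ENTITY_REF.match(globe)):
        # Covers the reported case of the plain string "earth".
        problems.append(f"globe is not an entity reference: {globe!r}")
    return problems
```

Returning a list rather than raising lets a dump reader log every anomaly and keep going, which fits the "adapt without fuss" approach described in this thread.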
On 10.08.2013 16:54, Jiang BIAN wrote:
About the inconsistency in the dump file, has a bug entry been created for this? (I can create one, if anyone can point me to the proper place to do that.)
It's not a bug, and it can't really be fixed: the dumps contain the revisions as they are. The internal format changes over time. Pages that have been modified after the change will use the new format; older pages will use the old format. There's really not much we can do about it.
I do agree though that we should provide JSON dumps using the stable external format.
-- daniel
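Since old and new revisions will keep coexisting in the dumps, a reader has to accept both entity forms. A minimal sketch of such a normalizer follows, covering the two forms quoted in this thread ("q100000" and ["item",100]). The function name is hypothetical, and the mapping of "property" to the prefix "P" is an assumption, since only "item" appears in the thread.

```python
import re

# Old form seen in the thread: a lowercase letter followed by digits, e.g. "q1".
_OLD_STYLE = re.compile(r"^([a-z])(\d+)$")

# New form seen in the thread: ["item", 1]. "property" is an assumed extra case.
_PREFIX_BY_TYPE = {"item": "Q", "property": "P"}


def normalize_entity_id(entity):
    """Convert either internal entity form to a canonical ID such as 'Q1'."""
    if isinstance(entity, str):
        match = _OLD_STYLE.match(entity.lower())
        if match:
            return match.group(1).upper() + match.group(2)
    elif isinstance(entity, (list, tuple)) and len(entity) == 2:
        kind, number = entity
        prefix = _PREFIX_BY_TYPE.get(kind)
        if prefix is not None and isinstance(number, int):
            return f"{prefix}{number}"
    # Fail loudly so a further format change is noticed immediately.
    raise ValueError(f"unrecognized entity format: {entity!r}")
```

Raising on anything unrecognized is a deliberate choice here: because the internal format may change again without notice, a silent fallback would hide the next change.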
So is there a spec for the stable external format?
If you could include a version number for the format used by the data, it would be much easier to write compatible code and/or notice changes immediately.
On 10.08.2013 22:42, Jiang BIAN wrote:
So is there a spec for the stable external format?
If you could include a version number for the format used by the data, it would be much easier to write compatible code and/or notice changes immediately.
I don't think there's a formal spec, though we really should have one. And the version number is a good idea. Put it on bugzilla, please :)
-- daniel
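To illustrate how a consumer might use such a version number, here is a small sketch. No version field exists in the data discussed in this thread: the field name "format_version", the set of supported versions, and the exception class are all hypothetical.

```python
# Versions this (hypothetical) reader knows how to parse.
SUPPORTED_VERSIONS = frozenset({1})


class UnsupportedFormatError(ValueError):
    """Raised when a record's format version is missing or unknown."""


def check_format_version(record, supported=SUPPORTED_VERSIONS):
    """Return the record's format version, or raise if it cannot be handled."""
    version = record.get("format_version")  # hypothetical field name
    if version is None:
        raise UnsupportedFormatError("record carries no format_version field")
    if version not in supported:
        raise UnsupportedFormatError(
            f"unsupported format_version {version!r}; "
            f"supported: {sorted(supported)}"
        )
    return version
```

Checking the version before parsing turns a silent misparse into an immediate, explicit error, which is the "notice changes immediately" benefit mentioned above.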
Just saw that Daniel already submitted this on Bugzilla. I think that voting on the bugs can speed things up, right? ;)
https://bugzilla.wikimedia.org/show_bug.cgi?id=52801
https://bugzilla.wikimedia.org/show_bug.cgi?id=52802
Cheers, Dimitris
Actually, yes. We do take votes into account (but they do not decide the priority).