Is there a way to download the wikidata ontology?
Kind regards,
Rüdiger Klein
Dr. Rüdiger Klein
Fraunhofer Institute for Intelligent Analysis and Information Systems (IAIS)
Dept. Adaptive Reflective Teams (ART)
Schloss Birlinghoven, 53754 Sankt Augustin, Germany
Tel: +49 2241 14 2608
Fax: +49 2241 14 2342
E-Mail: ruediger.klein@iais.fraunhofer.de
Hello,
You can find it here: http://wikiba.se/ontology-1.0.owl
If you have questions regarding the ontology, feel free to ask.
Bests,
On 4 January 2017 at 10:42, Klein, Rüdiger <ruediger.klein@iais.fraunhofer.de> wrote:
Is there a way to download the wikidata ontology?
Wikidata mailing list Wikidata@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikidata
On 04.01.2017 at 11:00, Léa Lacroix wrote:
Hello,
You can find it here: http://wikiba.se/ontology-1.0.owl
If you have questions regarding the ontology, feel free to ask.
Please note that this is the *wikibase* ontology, which defines the meta-model for the information on Wikidata. It models statements, sitelinks, source references, etc.
This ontology does not model "real world" concepts or properties like location or color or children, etc. Modeling on this level is done on Wikidata itself; there is no fixed RDF or OWL schema or ontology.
The best you can get in terms of "downloading the wikidata ontology" would be to download all properties and all the items representing classes. We currently don't have a separate dump for these. Also, do not expect this to be a concise or consistent model that can be used for reasoning. You are bound to find contradictions and loose ends.
Hi Rüdiger,
Daniel refers to several independent aspects of Wikidata:
(1) The ontology is not separated from the data. Schematic information is mostly managed by encoding it in data as well. Therefore, if you want some of it (but not the rest), then some extraction will be necessary. The Wikidata SPARQL service is your friend for not-too-big (up to some 100K triples) on-the-fly data exports, enough to get the whole class hierarchy, for example. We also have created some ontology-like excerpts in the past [1]. These have been done offline by processing the data dump using Wikidata Toolkit.
(2) The ontology is very lightweight. Wikidata mostly encodes properties and their types, some hierarchical information on properties and classes, and some "weak" hints on things like domain and range for some properties. So there are no complex OWL axioms there. This is also the reason why the ontology should not contain any logical contradictions -- when Daniel refers to "contradictions" I guess he means incoherences in the overall modelling (which contradict human intuition).
(3) The ontology may change at any time. This is a consequence of (1) and the fact that Wikidata is controlled by a global community.
For all of these reasons, there cannot be one "Wikidata ontology" but there might still be many useful ontological things you can get without too much effort.
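As a rough illustration of the on-the-fly export mentioned in (1), here is a minimal Python sketch that asks the Wikidata SPARQL service for "subclass of" (P279) pairs. The row limit, the function names and the User-Agent string are assumptions made for this example, not something prescribed in this thread, and a full hierarchy export would need a larger limit or paging.

```python
import json
import urllib.parse
import urllib.request

# Public Wikidata SPARQL endpoint (WDQS).
WDQS_ENDPOINT = "https://query.wikidata.org/sparql"

# P279 is Wikidata's "subclass of" property; the LIMIT keeps the export small.
SUBCLASS_QUERY = """
SELECT ?class ?superclass WHERE {
  ?class wdt:P279 ?superclass .
}
LIMIT %d
"""


def build_request_url(limit=1000):
    """Return the GET URL for a subclass-of export of at most `limit` rows."""
    query = SUBCLASS_QUERY % limit
    params = urllib.parse.urlencode({"query": query, "format": "json"})
    return WDQS_ENDPOINT + "?" + params


def parse_subclass_pairs(payload):
    """Turn a SPARQL JSON result document into (class, superclass) URI pairs."""
    bindings = payload["results"]["bindings"]
    return [(b["class"]["value"], b["superclass"]["value"]) for b in bindings]


def fetch_subclass_pairs(limit=1000):
    """Run the query against WDQS (needs network access)."""
    request = urllib.request.Request(
        build_request_url(limit),
        # Hypothetical User-Agent; replace it with one describing your tool.
        headers={"User-Agent": "ontology-export-sketch/0.1 (example)"},
    )
    with urllib.request.urlopen(request) as response:
        return parse_subclass_pairs(json.load(response))
```

Calling `fetch_subclass_pairs(1000)` returns a list of URI pairs that can be written to a file or loaded into a graph library. Since the data changes all the time, two runs may give different results.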
If you are interested in learning about the classes and properties used in Wikidata to get an informal idea of its current schema and content, then you could also browse this data in SQID [2].
Best regards,
Markus
[1] http://tools.wmflabs.org/wikidata-exports/rdf/exports/20160801/dump_download...
[2] https://tools.wmflabs.org/sqid/#/browse?type=properties
On 05.01.2017 16:15, Daniel Kinzler wrote:
Please note that this is the *wikibase* ontology, which defines the meta-model for the information on Wikidata. It models statements, sitelinks, source references, etc.
This ontology does not model "real world" concepts or properties like location or color or children, etc. Modeling on this level is done on Wikidata itself; there is no fixed RDF or OWL schema or ontology.
The best you can get in terms of "downloading the wikidata ontology" would be to download all properties and all the items representing classes. We currently don't have a separate dump for these. Also, do not expect this to be a concise or consistent model that can be used for reasoning. You are bound to find contradictions and loose ends.
Hi!
The best you can get in terms of "downloading the wikidata ontology" would be to download all properties and all the items representing classes. We currently don't have a separate dump for these. Also, do not expect this to be a concise or consistent model that can be used for reasoning. You are bound to find contradictions and loose ends.
Also, Wikidata Toolkit (https://github.com/Wikidata/Wikidata-Toolkit) can be used to generate something like a taxonomy - see e.g. http://tools.wmflabs.org/wikidata-exports/rdf/exports/20160801/dump_download...
But one has to be careful with it, as Wikidata may not (and frequently does not) follow assumptions that are true for proper OWL models - there are no limits on what can be considered a class, a subclass, an instance, etc. The same entity can be treated both as class and individual, and there may be some weird structures, including even outright errors such as cycles in the subclass graph, etc. And, of course, it changes all the time :)
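To make the point about cycles concrete, here is a small Python sketch that looks for one cycle in a list of (subclass, superclass) pairs, for instance pairs taken from an extracted taxonomy. The function name and the input format are assumptions for illustration only; this is not how Wikidata Toolkit does it.

```python
from collections import defaultdict


def find_subclass_cycle(edges):
    """Return one cycle in the subclass graph as a list of nodes, or None.

    `edges` is an iterable of (subclass, superclass) pairs.
    """
    graph = defaultdict(list)
    for sub, sup in edges:
        graph[sub].append(sup)

    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on current path, finished
    colour = defaultdict(int)

    for start in list(graph):
        if colour[start] != WHITE:
            continue
        # Iterative depth-first search; `path` holds the current chain.
        stack = [(start, iter(graph[start]))]
        path = [start]
        colour[start] = GREY
        while stack:
            node, successors = stack[-1]
            advanced = False
            for nxt in successors:
                if colour[nxt] == GREY:
                    # Back edge: the chain from nxt to node closes a cycle.
                    return path[path.index(nxt):] + [nxt]
                if colour[nxt] == WHITE:
                    colour[nxt] = GREY
                    stack.append((nxt, iter(graph.get(nxt, ()))))
                    path.append(nxt)
                    advanced = True
                    break
            if not advanced:
                colour[node] = BLACK
                stack.pop()
                path.pop()
    return None
```

The search is iterative rather than recursive so that a deep hierarchy does not hit Python's recursion limit. It reports only the first cycle it finds.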
Hi,
In case it helps, there is also a version from the latest DBpedia release, a few months old, for properties [1,2] and classes [3,4]. The properties do not contain the "rdf:type rdf:Property / owl:*Property" definitions, and the current dump of the classes contains only subClassOf statements to DBpedia classes, based on some mappings described in [5].
[1] http://downloads.dbpedia.org/2016-04/core-i18n/wikidata/properties_wikidata....
[2] http://downloads.dbpedia.org/preview.php?file=2016-04_sl_core-i18n_sl_wikida... (preview)
[3] http://downloads.dbpedia.org/2016-04/core-i18n/wikidata/ontology_subclassof_...
[4] http://downloads.dbpedia.org/preview.php?file=2016-04_sl_core-i18n_sl_wikida... (preview)
[5] http://www.semantic-web-journal.net/content/wikidata-through-eyes-dbpedia-1
On Thu, Jan 5, 2017 at 11:21 PM, Stas Malyshev smalyshev@wikimedia.org wrote:
Same entity can be treated both as class and individual
This is valid for OWL as well.
On 06.01.2017 18:24, Thomas Douillard wrote:
Same entity can be treated both as class and individual
This is valid for OWL as well.
Yes, and since Wikidata does not feature very powerful ontological statements, you could treat this semantically as in OWL 2 DL as well, i.e., a weak approach works, in which the "class" and the "instance" are not really identified with each other.
Nevertheless, using the ontology might still be challenging depending on what you want to do with it, since there are quite a few meta-levels (classes of classes of classes ...) that are not cleanly separated. When I last checked, we even had some instance-of cycles ;-) Even this is not a technical problem for the OWL semantics, but maybe for some tools and approaches.
Cheers,
Markus
Hi,
The biggest casualty of the current mess is that people like me do not care at all about it. It cannot be explained, nobody is interested in explaining it, and consequently there is little use for it. It is a "must have", so it is there. Fine, let's move on.
Thanks,
GerardM
On 07.01.2017 15:27, Gerard Meijssen wrote:
Hi, The biggest casualty of the current mess is that people like me do not care at all about it. It cannot be explained, nobody is interested in explaining it, and consequently there is little use for it. It is a "must have", so it is there. Fine, let's move on.
The "subclass of" and "instance of" statements are actually used in very many WDQS queries, often with * expressions to navigate the hierarchy. WDQ also had a special feature, TREE, for this purpose. So I'd say that this part of the data is rather important to Wikidata. But if you find little use in it, that's ok too. Nevertheless, we should try to fix the modelling errors there, since they will affect many other people's Wikidata experience.
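For readers who have not seen such a query, the following Python sketch builds a typical one. On Wikidata, `wdt:P31` is "instance of" and `wdt:P279` is "subclass of", and the `*` lets the subclass step match zero or more levels. The helper names are made up for this example, and the second function only imitates the same navigation on in-memory pairs so the idea can be tried without network access.

```python
def instances_query(class_qid, limit=100):
    """Build a SPARQL query for all direct and indirect instances of a class."""
    if not (class_qid.startswith("Q") and class_qid[1:].isdigit()):
        raise ValueError("expected an item ID such as 'Q5', got %r" % class_qid)
    return (
        "SELECT ?item WHERE {\n"
        "  ?item wdt:P31/wdt:P279* wd:%s .\n"
        "}\n"
        "LIMIT %d" % (class_qid, limit)
    )


def instances_via_hierarchy(instance_of, subclass_of, target):
    """Imitate P31/P279* over in-memory (subject, object) pairs."""
    # Collect the target class and all of its transitive subclasses.
    accepted = {target}
    frontier = [target]
    while frontier:
        current = frontier.pop()
        for sub, sup in subclass_of:
            if sup == current and sub not in accepted:
                accepted.add(sub)
                frontier.append(sub)
    # An item matches if any of its classes is in the accepted set.
    return {item for item, cls in instance_of if cls in accepted}
```

For example, `instances_query("Q5")` produces a query for instances of human (Q5) or of any of its subclasses. Because the local version tracks which classes it has already accepted, it also terminates when the subclass data contains a cycle.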
Cheers,
Markus
Markus Kroetzsch, 08/01/2017 00:12:
The subclass of and instance of statements are actually used in very many WDQS queries, often with * expressions to navigate the hierarchy.
I think that's what Gerard meant: you don't have to know what's under the hood, as long as it works. When you get some unexpected result, you go check what went wrong in the chain of subclasses etc. This is at least what I do, although I also work with some more traditional people who want to know the full ontology before even entering their first statement (of course they get lost for a few months).
Nemo
Hi,
What I mean and what I say is that there has been no one willing to explain why certain items are in there. When you ask questions it is seen as a threat, and consequently I find I am treated like one. The consequence is that I do not care about the structure and totally ignore it. This is a shame because on occasion I do expect this has an impact.
Thanks,
GerardM
On 8 January 2017 at 00:15, Federico Leva (Nemo) nemowiki@gmail.com wrote:
Markus Kroetzsch, 08/01/2017 00:12:
The subclass of and instance of statements are actually used in very many WDQS queries, often with * expressions to navigate the hierarchy.
I think that's what Gerard meant: you don't have to know what's under the hood, as long as it works. When you get some unexpected result, you go check what went wrong in the chain of subclasses etc. This is at least what I do, although I also work with some more traditional people who want to know the full ontology before even entering their first statement (of course they get lost for a few months).
Nemo
Gerard, I am willing to help. What do you want explained? (We can perhaps move that off-list.)
Kind regards,
Jan Ainali http://ainali.com
Hi, It is in the logic. When a king is a monarch and a monarch is a politician I am fine. But when people insist that a "King of Iberia" is a subclass it does not make sense. People hold the office, and it is singular. When such things result in struggles, I think we have a problem.
I have asked in the past to explain the nonsense on items like monarch. When I look at Reasonator there is so much that is plainly problematic that it is best to ignore it. What complicates it is that the ontology seems to end with politician, and that is a travesty in and of itself. With other "occupations" there is a wealth of upper levels that seem to be completely arbitrary, and when asked I find it reasonable that nobody steps up to explain because the consequences of answers are problematic. Thanks, GerardM
On 09.01.2017 12:55, Gerard Meijssen wrote:
Hi, It is in the logic. When a king is a monarch and a monarch is a politician I am fine. But when people insist that a "King of Iberia" is a subclass it does not make sense. People hold the office, and it is singular. When such things result in struggles, I think we have a problem.
I would say: "Every King of Iberia was also a king."
Only the "current king of Iberia" is a single person, but Wikidata is about all of history, so there are many such kings. The office of "King of Iberia" is still singular (it is a singular class) and it can have its own properties etc. I would therefore say (without having checked the page):
King of Iberia instance of office
King of Iberia subclass of king
I have asked in the past to explain the nonsense on items like monarch. When I look at Reasonator there is so much that is plainly problematic that it is best to ignore it. What complicates it is that the ontology seems to end with politician, and that is a travesty in and of itself. With other "occupations" there is a wealth of upper levels that seem to be completely arbitrary, and when asked I find it reasonable that nobody steps up to explain because the consequences of answers are problematic.
I agree that there are many cases that need to be modelled in a more coherent way. I can only imagine progress in this area to happen on a case-by-case basis. One really has to look into the details and check what works best in each case. Often there is no wrong or right here, but there is a choice how to model things. But once a choice is made, it should be applied coherently throughout.
Regards,
Markus
Hi, My answer is that every past king is not a king but an office holder. For me the name king is just a label; there is no logic in the way it is applied. There are empires where the office holder is called a king, for instance... For me they are all monarchs. They are associated with a specific country.
This prevents illogical things like "follows" on a person; for me they are qualifiers because all too often one person can have multiple predecessors and successors. It is the only sane way to do this (imho). Thanks, GerardM
On 09.01.2017 at 04:36, Markus Kroetzsch wrote:
Only the "current king of Iberia" is a single person, but Wikidata is about all of history, so there are many such kings. The office of "King of Iberia" is still singular (it is a singular class) and it can have its own properties etc. I would therefore say (without having checked the page):
King of Iberia instance of office
King of Iberia subclass of king
To be semantically strict, you would need to have two separate items, one for the office, and one for the class, because the individual kings have not been instances of the office; they have been holders of the office. And they have been instances of the class, but not holders of the class.
On Wikidata, we often conflate these things for the sake of simplicity. But when you try to write queries, this does not make things simpler; it makes it harder.
Anything that is a subclass of X, and at the same time an instance of Y, where Y is not "class", is problematic. I think this is the root of the confusion Gerard speaks of.
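The pattern described here can be checked mechanically. The Python sketch below lists items that are a subclass of something and also an instance of something other than a designated "class" item. The triple format, the property strings and the `class_item` default are assumptions chosen for this example. Whether a reported item is really a modelling problem remains a judgment call; the code only finds candidates.

```python
def conflated_items(statements, class_item="class"):
    """Return items that are a subclass of something and also an instance of
    something other than `class_item`.

    `statements` is an iterable of (subject, property, object) triples using
    the property names "instance of" and "subclass of".
    """
    has_superclass = set()
    non_class_types = {}
    for subject, prop, obj in statements:
        if prop == "subclass of":
            has_superclass.add(subject)
        elif prop == "instance of" and obj != class_item:
            non_class_types.setdefault(subject, set()).add(obj)
    # Keep only items that show up in both roles.
    return {item: types for item, types in non_class_types.items()
            if item in has_superclass}
```

With the two "King of Iberia" statements quoted above as input, the function returns `{"King of Iberia": {"office"}}`.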
Hi,
As I said before, there are only office holders. A person is not defined only by the office that he once held. Mr Obama is more than just the incumbent president of the United States. He is not defined by it.
Thanks,
GerardM
On 01/09/2017 07:20 AM, Daniel Kinzler wrote:
On 09.01.2017 at 04:36, Markus Kroetzsch wrote:
Only the "current king of Iberia" is a single person, but Wikidata is about all of history, so there are many such kings. The office of "King of Iberia" is still singular (it is a singular class) and it can have its own properties etc. I would therefore say (without having checked the page):
King of Iberia instance of office
King of Iberia subclass of king
To be semantically strict, you would need to have two separate items, one for the office, and one for the class, because the individual kings have not been instances of the office; they have been holders of the office. And they have been instances of the class, but not holders of the class.
On Wikidata, we often conflate these things for the sake of simplicity. But when you try to write queries, this does not make things simpler; it makes it harder.
Anything that is a subclass of X, and at the same time an instance of Y, where Y is not "class", is problematic. I think this is the root of the confusion Gerard speaks of.
There is no a priori reason that an office cannot be a class. Some formalisms don't allow this, but there are others that do. Some sets of rules for ontology construction don't allow this, but there are others that do. There is certainly no universal semantic consideration, even in any strict notion of semantics, that would require that there be two separate items here.
As far as I can tell, the Wikidata formalism is not one that would disallow offices being classes. As far as I can tell, the rules for constructing the Wikidata ontology don't disallow it either.
Peter F. Patel-Schneider Nuance Communications
I agree with Peter here. Daniel's statement that "Anything that is a subclass of X, and at the same time an instance of Y, where Y is not 'class', is problematic" is simply too strong. The classical example is Harry the eagle, with eagle being a species.
The following paper has a much more measured and subtle approach to this question:
http://snap.stanford.edu/wikiworkshop2016/papers/Wiki_Workshop__WWW_2016_pap...
I still think it is potentially and partially too strong, but certainly much better than Daniel's strict statement.
Although there is no formal problem here, care does have to be taken when modelling entities that are to be considered as both classes and non-classes (or, and especially, metaclasses and non-metaclass classes). It is all too easy for even experienced modellers to make mistakes. The problem is worse when the modelling formalism is weak (as the Wikidata formalism is) and thus does not itself provide much support to detect mistakes. The problem is even worse when the modelling methodology often does not provide much description of the entities (as is the case in Wikidata).
The paper that Denny cites proposes that each entity be given a level (0 for non-class entities and some number greater than 0 for classes). The instance of relationship is limited so that it only relates entities to entities that are a single level higher, and the subclass of relationship is limited so that it only relates entities within a single level. This rules out the problematic earthquake (Q7944), which used to be both an instance and a subclass of natural disaster (Q8065), and white (Q23444), which is currently both an instance and a subclass of color (Q1075). Although neither of these situations is a formal failure, they are both almost certainly modelling failures.
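As a minimal sketch of the level rule described above (not the paper's own algorithm and not any existing Wikidata tool), the following Python checks a handful of hand-written statements against level assignments. The `levels` table, the `check_levels` function and the particular level numbers are assumptions made purely for illustration; only the two constraints themselves come from the description above.

```python
# Hypothetical level assignments: 0 for non-class entities, >0 for classes.
levels = {
    "Harry": 0,
    "eagle": 1,
    "bird": 1,
    "species": 2,
    "earthquake": 1,
    "natural disaster": 1,
}

# (subject, relation, object) statements, written by hand for this sketch.
statements = [
    ("Harry", "instance of", "eagle"),
    ("eagle", "subclass of", "bird"),
    ("eagle", "instance of", "species"),
    ("earthquake", "subclass of", "natural disaster"),
    ("earthquake", "instance of", "natural disaster"),
]


def check_levels(statements, levels):
    """Return the statements that break the level rule."""
    violations = []
    for subject, relation, obj in statements:
        s, o = levels[subject], levels[obj]
        if relation == "instance of":
            ok = o == s + 1   # must point exactly one level up
        elif relation == "subclass of":
            ok = o == s       # must stay within a single level
        else:
            ok = True         # other relations are not constrained here
        if not ok:
            violations.append((subject, relation, obj))
    return violations


for violation in check_levels(statements, levels):
    print("level violation:", violation)
```

With these assumed levels, only the earthquake "instance of" statement is flagged. No assignment of levels could accept both earthquake statements, since one requires equal levels and the other requires a difference of one.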
It is, however, useful to be able to model entities that do not fit into this modelling methodology, like the class of all classes. These exceptions are, I think, rare.
Anyway, what this points out is that there are problems in how Wikidata models the world. Better guidelines on how to model on Wikidata would be useful. Strict rules, however, can easily prevent modelling what Wikidata should be modelling.
My suggestion is that Wikidata classes should have more information associated with them. It should be possible for a modeller to easily determine how a class is supposed to be used. This is not currently possible for color and I think is the main source of the problems with color.
Peter F. Patel-Schneider Nuance Communications
Databases use schemas, but that doesn't mean the schemas always make sense to humans. Rules are typically used to find gaps in the data and the schema.
In Freebase, we decided to handle things like this with rules (as Peter leans towards): the community would help develop them, and then we'd generate lists so that folks could work on correcting the incorrect data or schema. The rules were all class-based, since Freebase allowed multiple classes to be applied to a topic or entity.
I don't like Instance Of and never use it, ever. I never assert statements about it, or ask questions involving Instance Of with SPARQL. I find that WD life is easier without it.
Further idea... These rules were called Mutexes in Freebase and actually lived as part of its data. In fact, you could use those same rules in WD if you wanted to. In Freebase, if someone typed or classed a Musical Artist as a Fictional Character as well, they would immediately get a warning in the GUI that they could not also type it as Fictional Character.
"Musical Artist != Fictional Character" https://groups.google.com/forum/#!topic/freebase-discuss/h0LgDZIL6N4
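The mutex idea can be sketched roughly as follows. This is not Freebase's actual implementation; the `MUTEXES` set, the `conflicting_types` and `add_type` helpers, and the representation of types as plain strings are all hypothetical, chosen only to illustrate refusing a type that is declared incompatible with one an entity already has.

```python
# Hypothetical mutex declarations: unordered pairs of types that must not co-occur.
MUTEXES = {
    frozenset({"Musical Artist", "Fictional Character"}),
}


def conflicting_types(existing_types, new_type):
    """Return the existing types declared mutually exclusive with new_type."""
    return sorted(
        t for t in existing_types
        if frozenset({t, new_type}) in MUTEXES
    )


def add_type(existing_types, new_type):
    """Add new_type unless a mutex forbids it; return (types, warning or None)."""
    conflicts = conflicting_types(existing_types, new_type)
    if conflicts:
        warning = f"cannot type as {new_type}: conflicts with {', '.join(conflicts)}"
        return set(existing_types), warning
    return set(existing_types) | {new_type}, None


# An entity already typed as Musical Artist cannot also become a Fictional Character.
types, warning = add_type({"Musical Artist"}, "Fictional Character")
print(warning)
```

Storing each mutex as an unordered pair means the check works in either direction, whichever of the two types was applied first.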
On 09.01.2017 at 11:16, Peter F. Patel-Schneider wrote:
Although there is no formal problem here, care does have to be taken when modelling entities that are to be considered as both classes and non-classes (or, and especially, metaclasses and non-metaclass classes). It is all too easy for even experienced modellers to make mistakes. The problem is worse when the modelling formalism is weak (as the Wikidata formalism is) and thus does not itself provide much support to detect mistakes. The problem is even worse when the modelling methodology often does not provide much description of the entities (as is the case in Wikidata).
That's what I meant by "problematic". I did not mean to say it's *wrong* per se.