Subject: [Wikidata-l] 'Person' or 'human', upper ontologies and migrating 4 million claims
Antoine
While there are discussions in the RFC about high-level ontologies, there is other stuff happening out on the Wikidata item pages.
Editors are constructing low-level ontologies using 'instance of' and 'subclass of', and these are gradually creeping upwards.
'is in administrative unit' and 'located on terrain feature' are being used to build another hierarchy of places on Earth, and 'part of' is being used to build a hierarchy of places off the planet.
'occupation (person)' is becoming more important than 'instance of' in classifying humans and 'child'
'instance of' is also being used to classify all the items derived from Wikipedia pages that don't quite fit - category pages, disambiguation pages, compound items (describing more than one thing, like 'Bonnie and Clyde') - so tools can find these and exclude them from queries or whatever.
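As a rough illustration of that last point, here is a minimal Python sketch of how a tool might use 'instance of' to skip such items. The item names, class labels and the `queryable` helper are all made up for the example; they are not actual Wikidata identifiers or any real tool's API.

```python
# Hypothetical items mapped to their 'instance of' values.
# Labels are illustrative only, not real Wikidata identifiers.
ITEMS = {
    "Douglas Adams": {"human"},
    "Bonnie and Clyde": {"compound item"},
    "Category:Physicists": {"category page"},
    "Mercury": {"disambiguation page"},
}

# Classes a tool might want to leave out; an assumption for this sketch.
EXCLUDED = {"category page", "disambiguation page", "compound item"}


def queryable(items, excluded=EXCLUDED):
    """Return the names of items that carry none of the excluded classes."""
    return sorted(
        name for name, classes in items.items() if not classes & excluded
    )


print(queryable(ITEMS))
```

Only the item with no excluded class survives the filter, which is all a query tool needs from this use of 'instance of'.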
Personally I can't see an awful lot of use for an upper-level ontology - all the use cases I've seen are for the lower levels. If an upper level is to be added (and I'm sure it will be - 'encyclopaedic' is close to a synonym for 'completist') then why not have all of the upper-level ontologies? 'subclass of' can be used to create a variety of upper-level ontologies on top of the base levels derived from the items we have. After all, the enwp categories have three different upper-level ontologies!
Joe user:filceolaire
On Mon, Sep 23, 2013 at 1:00 PM, wikidata-l-request@lists.wikimedia.org wrote:
Send Wikidata-l mailing list submissions to wikidata-l@lists.wikimedia.org
To subscribe or unsubscribe via the World Wide Web, visit https://lists.wikimedia.org/mailman/listinfo/wikidata-l or, via email, send a message with subject or body 'help' to wikidata-l-request@lists.wikimedia.org
You can reach the person managing the list at wikidata-l-owner@lists.wikimedia.org
When replying, please edit your Subject line so it is more specific than "Re: Contents of Wikidata-l digest..."
Today's Topics:
- 'Person' or 'human', upper ontologies and migrating 4 million claims (Antoine Isaac)
Message: 1
Date: Sun, 22 Sep 2013 22:24:32 +0200
From: Antoine Isaac <aisaac@few.vu.nl>
To: wikidata-l@lists.wikimedia.org
Subject: [Wikidata-l] 'Person' or 'human', upper ontologies and migrating 4 million claims
Message-ID: <523F5200.7080704@few.vu.nl>
Content-Type: text/plain; charset="ISO-8859-1"; format=flowed
Dear all,
First, sorry for sending an email: I want to help, but I don't have the time required to understand how the wiki RfC mechanism works [1]. More precisely, that one really doesn't seem the appropriate one for a first dive :-(
In fact, reading it, I'm not even sure I understand the question anymore. To me the original question was about the properties P31 and P279 themselves (Eric's mail still lists them as an option, albeit a popular one), i.e., rather about how to represent a classification (independent of which one is chosen). But now I see plenty of hardcore ontological discussions on the RfC page, which are indeed about getting a unified top-level ontology...
The basic question is, can you really get a unified, perfectly structured and clean classification of things? I'm slightly surprised that Wikidata would go there. You want users to add classes in the future, no? Or to use the existing Wikipedia categories as a source of classification? In either case, you'd end up making weird inferences possible if you apply the formal semantics of P31 and P279 as they're defined for rdf:type and rdfs:subClassOf [4,5]. Actually, even if you invest time in making a clean top level, the lower-level parts of the classification will probably very soon diverge from the formal ontology "meta-principles" that structure SUMO, DOLCE, BFO, etc.
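To make the "weird inferences" concrete, here is a minimal Python sketch of what the formal reading implies: subclass links are transitive, and every instance of a class is also an instance of all its superclasses. The class names and the `rdfs_closure` function are illustrative assumptions, not Wikidata data or any library's API.

```python
def rdfs_closure(instance_of, subclass_of):
    """Propagate types along subclass links, as the formal semantics of
    rdf:type and rdfs:subClassOf require (transitivity plus inheritance)."""

    def ancestors(cls):
        # Collect every class reachable through subclass_of links.
        seen, stack = set(), [cls]
        while stack:
            for parent in subclass_of.get(stack.pop(), ()):
                if parent not in seen:
                    seen.add(parent)
                    stack.append(parent)
        return seen

    return {
        item: set(classes) | {a for c in classes for a in ancestors(c)}
        for item, classes in instance_of.items()
    }


# Two links that each look reasonable when added on their own (made up).
SUBCLASS_OF = {
    "fictional detective": {"detective"},
    "detective": {"human"},
}
INSTANCE_OF = {"Sherlock Holmes": {"fictional detective"}}

inferred = rdfs_closure(INSTANCE_OF, SUBCLASS_OF)
print(sorted(inferred["Sherlock Holmes"]))
```

Under the strict reading, the fictional character ends up typed as a human, although nobody asserted that directly.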
And it's probably very alright, for most of your usage scenarios. Having simple, intuitive classification semantics is possible without the full formal ontology apparatus. Namely, you can use something that looks like rdf:type/rdfs:subClassOf, but with looser semantics.
1. You could use something like the dc:type property from the Dublin Core framework, instead of rdf:type. Possibly creating sub-properties of it, using a list like the one at [7] for input.
2. You could use something like skos:broader and skos:narrower [8] for the links between the 'looser classes'.
Of course this does not correspond to a formal ontological framework in the Semantic Web sense. But well, if the 'classification' doesn't fit a super-formal framework, I see no reason to desperately try to shoehorn it into RDFS.
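The looser alternative in the two options above can be sketched in Python as follows. The triples, item names and both helper functions are hypothetical. The one fact relied on is that skos:broader is not declared transitive in SKOS (the transitive variant is the separate skos:broaderTransitive property), so nothing is inferred unless a tool chooses to walk the links.

```python
# Triples as (subject, predicate, object); all names are made up.
TRIPLES = [
    ("Sherlock Holmes", "dc:type", "fictional detective"),
    ("fictional detective", "skos:broader", "detective"),
    ("detective", "skos:broader", "human"),
]


def asserted_types(item, triples):
    """Return only the types stated directly; no entailment is applied."""
    return {o for s, p, o in triples if s == item and p == "dc:type"}


def broader(concept, triples, depth=1):
    """Follow skos:broader links for at most `depth` steps.
    The caller decides how far to walk; nothing is implied by default."""
    found, frontier = set(), {concept}
    for _ in range(depth):
        frontier = {
            o for s, p, o in triples
            if s in frontier and p == "skos:broader"
        } - found
        found |= frontier
    return found


print(asserted_types("Sherlock Holmes", TRIPLES))
print(broader("fictional detective", TRIPLES))
print(sorted(broader("fictional detective", TRIPLES, depth=2)))
```

The links remain useful for browsing and query expansion, but the item is only ever typed as what was actually stated.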
Note that I would quite disagree with the second part of the sentence from one of the RfC-related pages [9]: "There is a consensus on Wikidata against creating other properties which perform this function as it is felt a clean hierarchy of classes is in keeping with W3C recommendations and will make it easier to use the data here." First, getting a clean hierarchy won't make things easier if you end up with a too static/formal view of the world. Second, the feeling about the W3C recommendations is wrong. W3C has actually pushed SKOS to allow 'softer' classifications to be represented without having to undergo the ordeals and dangers of RDFS/OWL...
But I realize all this might be regarded as questioning the decision you made earlier on using P31 and P279 instead of the GND type, so I'm going to stop bothering you ;-)
Best,
Antoine
Antoine Isaac Scientific coordinator, Europeana.eu
[1] https://www.wikidata.org/wiki/Wikidata:Requests_for_comment/Migrating_away_from_GND_main_type
[2] http://lists.wikimedia.org/pipermail/wikidata-l/2013-September/002815.html
[3] http://lists.wikimedia.org/pipermail/wikidata-l/2013-September/002816.html
[4] http://www.w3.org/TR/rdf-schema/#ch_type
[5] http://www.w3.org/TR/rdf-schema/#ch_subclassof
[6] http://purl.org/dc/terms/type
[7] https://www.wikidata.org/wiki/Wikidata:Requests_for_comment/Migrating_away_from_GND_main_type#List_of_specialized_type_properties
[8] http://www.w3.org/TR/skos-primer/#secrel
[9] https://www.wikidata.org/wiki/Help:Modeling#Hierarchy_of_classes
End of Wikidata-l Digest, Vol 22, Issue 22
Joe,
I think we're not far from agreeing. My proposal was not about endorsing one upper-level ontology, but about being ready to have different things with this status, and probably things of dubious "ontological quality". (And yes, I believe it's quite alright: useful stuff can still be implemented on top of that.)
Antoine