Dear all,
First, sorry for sending an email: I want to help, but I don't have the time required to understand how the wiki RfC mechanism works [1]. More precisely, that one really doesn't seem the appropriate one for a first dive :-(
In fact, reading it, I'm not even sure I understand the question anymore. To me the original question was about the properties P31 and P279 themselves (Eric's mail still lists them as an option, albeit a popular one), i.e., about how to represent a classification (independently of which one is chosen). But now I see plenty of hardcore ontological discussions on the RfC page, which are indeed about getting a unified top-level ontology...
The basic question is, can you really get a unified, perfectly structured and clean classification of things? I'm slightly surprised that Wikidata would go there. You want users to add classes in the future, no? Or to use the existing Wikipedia categories as a source of classification? In either case, you'd end up making weird inferences possible, if you apply the formal semantics of P31 and P279 as they're defined for rdf:type and rdfs:subClassOf [4,5]. Actually, even if you invest time making a clean top level, the lower-level parts of the classification will probably very soon diverge from the formal ontology "meta-principles" that structure SUMO, DOLCE, BFO, etc.
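To make the "weird inferences" concrete, here is a minimal sketch of what happens when the two RDFS rules behind rdf:type and rdfs:subClassOf (subclass transitivity and type inheritance) are applied to category-style links. The item and category names are made up for illustration, and the tiny rule engine is an assumption for the sake of the example, not anything Wikidata runs.

```python
# Minimal sketch of RDFS-style entailment over (subject, predicate, object) tuples.
# All item and category names below are hypothetical illustrations.
TYPE = "rdf:type"
SUBCLASS = "rdfs:subClassOf"


def rdfs_closure(triples):
    """Apply subclass transitivity and type inheritance until nothing new appears."""
    inferred = set(triples)
    while True:
        new = set()
        for s, p, o in inferred:
            if p not in (TYPE, SUBCLASS):
                continue
            for s2, p2, o2 in inferred:
                # (s p o) plus (o subClassOf o2) entails (s p o2),
                # for p being either rdf:type or rdfs:subClassOf.
                if p2 == SUBCLASS and s2 == o:
                    new.add((s, p, o2))
        if new <= inferred:
            return inferred
        inferred |= new


# Category links that are fine for browsing but are not real class relations.
category_style = {
    ("Eiffel Tower", TYPE, "Buildings in Paris"),
    ("Buildings in Paris", SUBCLASS, "Paris"),
    ("Paris", SUBCLASS, "Capitals in Europe"),
}

closure = rdfs_closure(category_style)
# Under the formal semantics, the tower ends up typed as a European capital.
print(("Eiffel Tower", TYPE, "Capitals in Europe") in closure)
```

Each individual link looks harmless; it is the formal reading of the chain that produces the odd conclusion.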
And it's probably very alright, for most of your usage scenarios. Having simple, intuitive classification semantics is possible without the full formal ontology apparatus. Namely, you can use something that looks like rdf:type/rdfs:subClassOf, but with looser semantics.
1. You could use something like the dc:type property from the Dublin Core framework, instead of rdf:type. Possibly creating sub-properties of it, using a list like the one at [7] for input.
2. You could use something like skos:broader and skos:narrower [8] for the links between the 'looser classes'.
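A minimal sketch of what the looser semantics buy you, assuming hypothetical item and concept names: skos:broader is not declared transitive in SKOS, and neither it nor dc:type triggers the RDFS type-inheritance rule, so nothing is inferred automatically. A consumer who wants to walk up the hierarchy has to do so explicitly.

```python
# Sketch: category-style links expressed with looser properties.
# Item and concept names are hypothetical.
DC_TYPE = "dc:type"
BROADER = "skos:broader"

loose = {
    ("Eiffel Tower", DC_TYPE, "Buildings in Paris"),
    ("Buildings in Paris", BROADER, "Paris"),
    ("Paris", BROADER, "Capitals in Europe"),
}


def direct_types(item, triples):
    """Only the types that were actually asserted; no inheritance is applied."""
    return {o for s, p, o in triples if s == item and p == DC_TYPE}


def broader_chain(concept, triples):
    """Opt-in walk up skos:broader links, for navigation rather than inference."""
    seen, todo = set(), [concept]
    while todo:
        current = todo.pop()
        for s, p, o in triples:
            if s == current and p == BROADER and o not in seen:
                seen.add(o)
                todo.append(o)
    return seen


print(direct_types("Eiffel Tower", loose))
print(sorted(broader_chain("Buildings in Paris", loose)))
```

The item keeps exactly the type it was given, while the broader links remain available for browsing.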
Of course this does not correspond to a formal ontological framework in the Semantic Web sense. But well, if the 'classification' doesn't fit a super-formal framework, I see no reason to desperately try to shoehorn it into RDFS.
Note that I would quite disagree with the second part of the sentence from one of the RfC-related pages [9]: "There is a consensus on Wikidata against creating other properties which perform this function as it is felt a clean hierarchy of classes is in keeping with W3C recommendations and will make it easier to use the data here." First, getting a clean hierarchy won't make things easier, if you end up with too static/formal a view of the world. Second, the feeling about the W3C recommendations is wrong. W3C has actually pushed SKOS to allow 'softer' classifications to be represented without having to undergo the ordeals and dangers of RDFS/OWL...
But I realize all this might be regarded as questioning the decision you made earlier on using P31 and P279 instead of the GND type, so I'm going to stop bothering you ;-)
Best,
Antoine
---
Antoine Isaac
Scientific coordinator, Europeana.eu
[1] https://www.wikidata.org/wiki/Wikidata:Requests_for_comment/Migrating_away_f...
[2] http://lists.wikimedia.org/pipermail/wikidata-l/2013-September/002815.html
[3] http://lists.wikimedia.org/pipermail/wikidata-l/2013-September/002816.html
[4] http://www.w3.org/TR/rdf-schema/#ch_type
[5] http://www.w3.org/TR/rdf-schema/#ch_subclassof
[6] http://purl.org/dc/terms/type
[7] https://www.wikidata.org/wiki/Wikidata:Requests_for_comment/Migrating_away_f...
[8] http://www.w3.org/TR/skos-primer/#secrel
[9] https://www.wikidata.org/wiki/Help:Modeling#Hierarchy_of_classes
On 22 September 2013 at 21:24:48, Antoine Isaac (aisaac@few.vu.nl) wrote:
First, getting a clean hierarchy won't make things easier, if you end up with too static/formal a view of the world. Second, the feeling about the W3C recommendations is wrong. W3C has actually pushed SKOS to allow 'softer' classifications to be represented without having to undergo the ordeals and dangers of RDFS/OWL...
But I realize all this might be regarded as questioning the decision you made earlier on using P31 and P279 instead of the GND type, so I'm going to stop bothering you ;-)
Agreed with some of that.
The primary problem with GND type is that it tried to reduce the whole world into 7 arbitrary categories. I'm not sure how any proposed alternative could be as barmy as that.
My general preference is towards simple, unfussy and bubble-up types, looking at existing systems that work and following them as far as possible (and, no, big formal ontology systems do not satisfy the "that work" part of that sentence, nor do indulgent academic thought experiments).
We don't need big-design-up-front. We need to apply common sense and the Pareto Principle, and to avoid the excesses of the academic AI/KR community that have thus far made things like the RDF/OWL specs impenetrable to the average person who doesn't know what "model-theoretic semantics" are. (And also left us in a state where we have endless specs on ontological minutiae, but nobody seems to be bothered about fixing datatypes.)
Indeed, one of the things that's good about RDF is precisely that because you use URIs to define properties and classes, you can delegate the creation of those classes and properties to subject matter experts. The biology/medicine people design the schemata they need to represent genes and drugs and so on; if I need a simple property to represent dietary preference, I just coin it and start publishing. On Wikidata, rather than trying to suppose that the ontology people have solved all the problems, it'd be much better if we followed actual usage and unified our semantics with others using things like owl:sameAs and equivalentProperty relations.
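As a rough illustration of unifying semantics after the fact, here is a sketch in which a locally coined property is later linked to someone else's via an equivalence statement, and lookups are answered across both. The ex: and other: names and the data are invented for the example; only owl:equivalentProperty is a real OWL term.

```python
# Sketch: coin a property first, align it with an external vocabulary later.
# "ex:dietaryPreference" and "other:diet" are hypothetical names.
EQUIVALENT = "owl:equivalentProperty"

triples = {
    ("ex:alice", "ex:dietaryPreference", "vegetarian"),
    ("other:bob", "other:diet", "vegan"),
    ("ex:dietaryPreference", EQUIVALENT, "other:diet"),
}


def equivalents(prop, data):
    """Properties linked to `prop` by equivalence, read as symmetric and transitive."""
    group, todo = {prop}, [prop]
    while todo:
        current = todo.pop()
        for s, p, o in data:
            if p != EQUIVALENT:
                continue
            # Follow the equivalence link in both directions.
            for a, b in ((s, o), (o, s)):
                if a == current and b not in group:
                    group.add(b)
                    todo.append(b)
    return group


def values(prop, data):
    """Subject/value pairs for `prop` or any property equivalent to it."""
    props = equivalents(prop, data)
    return {(s, o) for s, p, o in data if p in props}


print(sorted(values("other:diet", triples)))
```

The point of the design is that neither publisher had to wait for the other: the alignment is one extra statement added once usage has shown the two properties mean the same thing.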
If I had to suggest some design principles, these would be where I start:
1. Prioritise pragmatism and common sense over theoretical unity.
2. Categorisation schemes are used by humans and implemented by humans. Design for humans rather than for hyper-intelligent robots or geniuses.
3. Actual usage takes priority over hypothetical use cases.
4. Use by Wikimedia projects takes priority over use by third parties.
5. Optimise for common use cases per Pareto's Principle.
6. You can apply two different types to something. Avoid creating union types. Wikipedia may have "Jewish LGBT scientists from Portugal with a cleft lip", but we don't need to replicate that kind of silliness.
7. If explaining your proposed category/property/schema to the man on the Clapham Omnibus would cause him to laugh to the point where it would disturb his fellow travellers, you need to rethink your proposal.
8. Take your necktie off. You are designing a fancy computer index card system, not going to meet the Queen of England.
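Principle 6 can be sketched as follows: give each item several simple types and combine them at query time, instead of minting a combined type for every intersection. The item names and type labels here are hypothetical.

```python
# Sketch of principle 6: several simple types per item, combined when querying.
# Item names and type labels are hypothetical.
items = {
    "item A": {"scientist", "person from Portugal"},
    "item B": {"scientist"},
    "item C": {"person from Portugal", "painter"},
}


def with_all_types(wanted, catalogue):
    """Items carrying every type in `wanted`; no combined type has to exist."""
    wanted = set(wanted)
    return sorted(name for name, types in catalogue.items() if wanted <= types)


print(with_all_types({"scientist", "person from Portugal"}, items))
```

Any new combination is just another query, so the type vocabulary stays small.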
The most amusing thing in the GND discussions (beyond the hilarious defences of how the absurd way the GND categorises fictional characters, planets, families and so on is actually okay) was people predicting anarchy if we didn't strictly follow some kind of schema designed by librarians. It's almost as if Wikipedia hadn't happened: the same people would have been saying back in 2001 that an encyclopedia written by random volunteers on the Internet would be impossible and the anarchic dream of pot-smoking hippies.
-- Tom Morris http://tommorris.org/
Tom,
I totally agree with your sentiments here. Two questions.
Do you believe there is any valuable use for upper ontologies in the wikidata system at all at this stage?
Could you describe in a little detail how you see a bubble-up classification scheme working in this context? I can imagine scenarios, but you've thought about it longer.
-Ben
On Tue, Sep 24, 2013 at 12:50 AM, Tom Morris tom@tommorris.org wrote:
[...]
Wikidata-l mailing list Wikidata-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikidata-l
Hoi,
When you read about upper ontologies, it says that the answer is highly political. We have already suffered the pain of using the DNB library system. For all our items with a GND identifier we can look up what the "main type (GND)" is. We have identifiers for many external sources, and for all of them we can look up how they fit in the classification scheme used by these external sources.
All in all, if there is a point to any of them, we can learn how they fit into an external classification system. <grin> Is this not the point of having all these external identifiers? </grin>
Thanks,
GerardM
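The lookup described above could be sketched roughly like this, assuming each external source exposes some table from its identifiers to its own types. The identifier values, scheme names and type labels are all hypothetical placeholders, not real GND data.

```python
# Sketch: resolve an item's classification through its external identifiers.
# Identifier values, scheme names and type labels are all hypothetical.
item_identifiers = {
    "item X": {"GND": "gnd-0001", "other scheme": "os-42"},
}

external_catalogues = {
    "GND": {"gnd-0001": "person"},
    "other scheme": {"os-42": "author"},
}


def external_types(item, identifiers, catalogues):
    """Map each scheme to the type it assigns, skipping anything unresolvable."""
    found = {}
    for scheme, ext_id in identifiers.get(item, {}).items():
        ext_type = catalogues.get(scheme, {}).get(ext_id)
        if ext_type is not None:
            found[scheme] = ext_type
    return found


print(external_types("item X", item_identifiers, external_catalogues))
```

Each external classification stays with its owner; the item only needs to carry the identifier.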
On 24 September 2013 17:27, Benjamin Good ben.mcgee.good@gmail.com wrote:
[...]