Re: [Wikidata-l] Question about wikipedia categories.

10 May 2013


      Jane Darnell suggested:
...
I think it is a perfectly good and noble ambition to strive for "a
logically sound ontology as contrasted with a controlled terminology".
I just don't believe it is attainable.
Logically sound ontologies have been built and used for years - they are not only possible, but multiple examples exist.  The CYC ontology (under development since 1985) has over 100,000 categories, and has been used commercially on large projects, and is well-structured and exhaustively tested.
Now, if one wants to say that "an ontology of everything" cannot be built because of logically incompatible views, beliefs, or assumptions, that may be true if one assumes that *all* assertions in an ontology must be logically consistent and contained within a single theory.  But logically incompatible theories *can* be included in a single sound ontology, because they can be circumscribed, isolated, and *specified* as logically incompatible theories, and the reasoner would never attempt to include any two incompatible theories in one reasoning process.  One may structure those incompatible theories in various ways, such as in sub-ontologies (or CYC microtheories).  And the reasoner can interpret all those theories, based on the foundation categories of the ontology, which are self-consistent.
Humans have many incompatible beliefs, but most people can "understand" the differing beliefs of religious systems, even while agreeing with only a few or even zero (by not accepting the assumptions or the reasoning process of those belief systems).  Computers can "interpret" the facts asserted in an ontology, though not yet as deeply as people; our goal is for the computers to "interpret" assertions (i.e. to recognize implications) so that they can reason with them, to the extent that they can be reasoned with.  Of course, at this stage computers can only use whatever is asserted or inferable from the assertions they have been given, but that amount of information is still massive.
We and our computers know the pitfalls of logical inconsistency and know how to avoid them.  The point is that we can do a great deal of valid and useful reasoning within self-consistent theories (which is why CYC is organized by "microtheories" that are self-consistent).  The residual question is how much we can include in a single self-consistent theory, and the answer is "a great deal".
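To make the idea of isolating incompatible theories concrete, here is a minimal sketch in Python.  It is only an illustration under simple assumptions: each theory is a plain mapping from a statement to a truth value, and the names `consistent`, `select_context`, `foundation`, `geocentric` and `heliocentric` are hypothetical.  It is not how CYC implements microtheories.

```python
def consistent(theories):
    """Return True if no statement is given conflicting truth values
    across the supplied theories."""
    merged = {}
    for theory in theories:
        for statement, value in theory.items():
            # setdefault stores the first value seen and returns it afterwards
            if merged.setdefault(statement, value) != value:
                return False
    return True


def select_context(foundation, theories):
    """Build one reasoning context from the foundation plus the chosen
    theories, refusing to combine theories that contradict each other."""
    context = [foundation, *theories]
    if not consistent(context):
        raise ValueError("incompatible theories cannot share a reasoning context")
    merged = {}
    for theory in context:
        merged.update(theory)
    return merged


# Hypothetical example data: a shared foundation and two rival theories.
foundation = {"Earth is a Planet": True}
geocentric = {"Sun orbits Earth": True}
heliocentric = {"Sun orbits Earth": False}

# Either rival theory can be reasoned with on top of the foundation...
geo_context = select_context(foundation, [geocentric])
helio_context = select_context(foundation, [heliocentric])
# ...but asking for both at once raises ValueError instead of mixing them.
```

Both theories live in the same store and share the same foundation; the only constraint is that they are never loaded into the same context.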
The COSMO ontology has over 7000 categories and over 800 relations, and is logically consistent using both the Pellet and FaCT++ reasoners on the OWL ontology.  This is not trivial, because the ontology has multiple "disjoint" relations and over 2000 restrictions, any one of which can cause an inconsistency if one gets sloppy adding new classes or instances.  CYC is many times more sophisticated, with years of practical application.  There are no "disjoint" relations or restrictions in the current OWL version of the DBpedia ontology, so there may be no contradictions based on those elements, but the "Pellet" reasoner in Protege still immediately bombs when invoked, so there are constructions that are in some way inconsistent.  With effort they can be tracked down and eliminated, but there are more immediate problems with the hierarchy and relations (properties) that I think should be addressed first.  Contradictions will always arise when one tries to add more detail to make the meanings of the categories less ambiguous (so that, for example, proper labels in all languages that carry the true meaning can be assigned), because no person can (in one lifetime) anticipate those contradictions by doing the kind of thorough reasoning that the computer reasoners can do when they view all of the logical implications.
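As an illustration of how a single "disjoint" relation can expose a sloppy addition, the following Python sketch checks instances against declared disjoint class pairs.  It is a toy under stated assumptions, not what Pellet or FaCT++ actually do: the class names, the `parents` mapping and the functions `ancestors` and `disjointness_violations` are all hypothetical, and only subclass inheritance is considered.

```python
def ancestors(cls, parents):
    """Return cls together with every superclass reachable from it."""
    seen = set()
    stack = [cls]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents.get(current, ()))
    return seen


def disjointness_violations(instance_types, parents, disjoint_pairs):
    """List (instance, class_a, class_b) wherever an instance ends up,
    directly or by inheritance, in two classes declared disjoint."""
    violations = []
    for instance, types in sorted(instance_types.items()):
        inferred = set()
        for t in types:
            inferred |= ancestors(t, parents)
        for a, b in disjoint_pairs:
            if a in inferred and b in inferred:
                violations.append((instance, a, b))
    return violations


# Hypothetical mini-hierarchy with one disjointness axiom.
parents = {"City": ["Place"], "Politician": ["Person"]}
disjoint_pairs = [("Person", "Place")]
instance_types = {
    "Springfield": ["City"],
    "Lincoln": ["City", "Politician"],  # sloppy: the city and the man conflated
}

violations = disjointness_violations(instance_types, parents, disjoint_pairs)
```

Note that the conflict is not visible in the asserted types alone ("City" and "Politician" are not declared disjoint); it appears only after the superclasses are followed, which is the kind of implication a person easily overlooks.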
For a particular purpose such as DBpedia, one will ideally develop the ontology stepwise while verifying at each step that the ontology serves the intended purpose - as it seems has been done thus far.  But the development of any such ontology can be greatly accelerated, and its soundness assured, and its functionality enhanced, by relating the categories and relations to those of existing ontologies that have been shown to be logically sound.  For DBpedia, where the existing ontology is rather small in comparison to many that have been built, this is a perfectly feasible task.
I am trying to get a good grasp of the existing structure and use of the DBpedia class system and ontology, so that I can make some suggestions for improvement.  I'm not sure how long this will take.    I think this effort is very worthwhile because the Wikipedia is an almost ideal environment, both large enough and small enough to provide a test case for detailed use of the reasoning capability of a properly structured ontology.  The reasoning allows many things to be inferred that are not explicitly stated, greatly enhancing the power of the Wikipedia itself.
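A small sketch of what "inferred but not explicitly stated" can mean in practice, written in Python.  It assumes just two kinds of rule (subclass inheritance and a declared range for a property) and uses made-up triples; the function `infer_types` and the property and class names are hypothetical and are not taken from the actual DBpedia ontology.

```python
def infer_types(facts, subclass_of, property_range):
    """Apply two rules until nothing new appears and return only the
    triples that were not in the original facts:
      1. (x, "type", C) and C subclass of D  =>  (x, "type", D)
      2. (s, p, o) and p has range C         =>  (o, "type", C)
    """
    known = set(facts)
    while True:
        new = set()
        for s, p, o in known:
            if p == "type":
                for parent in subclass_of.get(o, ()):
                    new.add((s, "type", parent))
            elif p in property_range:
                new.add((o, "type", property_range[p]))
        if new <= known:  # fixed point reached, nothing more to infer
            break
        known |= new
    return known - set(facts)


# Hypothetical data: one stated fact, one range declaration, one subclass link.
facts = {("Ada_Lovelace", "birthPlace", "London")}
property_range = {"birthPlace": "City"}
subclass_of = {"City": ["Place"]}

inferred = infer_types(facts, subclass_of, property_range)
```

From a single stated triple, the sketch derives that London is a City and therefore a Place, although neither statement was entered by anyone.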
Pat
Patrick Cassidy
MICRA Inc.
cassidy@micra.com
908-561-3416
