I'm still talking about the model I proposed in my first post in this thread. I did give one advantage: you can very simply query the types of units a country uses to classify its administrative units, like <Region> and <Departement> (as two items) for France, with a single request to the future simple query module: just retrieve the instances of the class <French type of administrative units>. This model also applies to any country's administrative territorial division.
I think <French type of administrative units> is the "auxiliary item" Markus mentioned. We can define precisely what it is, because this class <French type of administrative units> groups the types France uses to classify cities, departments, and so on. So if we talk about "Paris", there are several possibilities, like the region of Paris, the city of Paris ... The <The city of Paris> item is the only one that is clearly defined by French law, hence it is an instance of the class <French ville>, which in turn is an instance of <French type of administrative units>.
This seems to me a useful model, which generalises easily to classify things like "urban units", which are used for statistical purposes and are defined by the national statistical body of each country, such as INSEE in France. Then we could also have an <Urban unit> class in Wikidata, but this is ambiguous.
This class could have a subclass <Urban unit as defined by INSEE>, with instances such as <Parisian urban unit>, for which INSEE gives statistical information. <Urban unit as defined by INSEE> in turn may be a subclass of <any geographical unit defined by INSEE>. Then if you want INSEE geographical units, you query the items that are instances of both <Urban unit> and <any geographical unit defined by INSEE>.
But now let's say you want to find INSEE's definition of urban unit itself, not its instances. One way to do that would be to look at the subclass tree of <any geographical unit defined by INSEE> or the one of <Urban unit>, or to compute the intersection of both trees.
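To make the tree intersection concrete, here is a minimal sketch in Python over a toy in-memory graph. It is not how the future query module would work; the dictionaries, the function names and the two items marked hypothetical are assumptions made only for illustration.

```python
# Toy "subclass of" / "instance of" statements. Item names are the placeholders
# used in this thread, except the two marked hypothetical below.
SUBCLASS_OF = {
    "Urban unit as defined by INSEE": {
        "Urban unit",
        "any geographical unit defined by INSEE",
    },
    # Hypothetical sibling classes, added so that the two trees differ.
    "Urban unit as defined by another agency": {"Urban unit"},
    "Other unit defined by INSEE": {"any geographical unit defined by INSEE"},
}
INSTANCE_OF = {
    "Parisian urban unit": {"Urban unit as defined by INSEE"},
}


def subclass_tree(root):
    """Return root plus every class below it via 'subclass of'."""
    found = {root}
    stack = [root]
    while stack:
        current = stack.pop()
        for child, parents in SUBCLASS_OF.items():
            if current in parents and child not in found:
                found.add(child)
                stack.append(child)
    return found


def instances_of(root):
    """Return items that are instances of root or of any of its subclasses."""
    classes = subclass_tree(root)
    return {item for item, direct in INSTANCE_OF.items() if direct & classes}


# Classes present in both trees: the candidates for "urban unit as INSEE defines it".
both = subclass_tree("Urban unit") & subclass_tree(
    "any geographical unit defined by INSEE"
)
print(sorted(both))
```

Note that the intersection needs two full tree traversals, which is the cost the metamodelling alternative tries to avoid.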
One alternative, using metamodelling this time, would be to have a class grouping all the definitions of statistical units used by INSEE. Each of those definitions corresponds to a class we already have: <Urban unit as defined by INSEE> corresponds to INSEE's definition of urban unit. The class of all definitions used by INSEE would then correspond to the class of all classes with a name like <... unit defined by INSEE>. I propose to create the item <any type of unit defined by INSEE>, with <Urban unit as defined by INSEE> as an instance of it (actually I might already have done it /o)
This is an alternative (not mutually exclusive, rather complementary) to just classifying administrative unit instances. In a way, this is just identifying (reifying) a classification system and putting an ''instance of'' <this item> statement on each of its classes.
I think this is interesting for Wikidata: as we are actually using a lot of different classification systems, having a clear and practicable pattern like this one, which needs just a few tools like Reasonator, is useful.
For example, if I want to compute the subclass tree of statistical units, but only with the classes used by INSEE, I can compute the subclass tree of the <statistical geographical unit> class and keep only the instances of <any type of unit defined by INSEE>, instead of computing a tree intersection.
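The filtering described above can be sketched as follows. This is only a toy illustration: the dictionaries, the helper names, the assumption that <Urban unit> is a subclass of <statistical geographical unit>, and the item marked hypothetical are not taken from Wikidata.

```python
# Toy statements; names follow this thread unless marked hypothetical.
SUBCLASS_OF = {
    # Assumed for this sketch: urban units are statistical geographical units.
    "Urban unit": {"statistical geographical unit"},
    "Urban unit as defined by INSEE": {"Urban unit"},
    # Hypothetical class, present in the tree but not tagged as INSEE's.
    "Urban unit as defined by another agency": {"Urban unit"},
}
INSTANCE_OF = {
    # The explicit "class of classes" statement proposed in this thread.
    "Urban unit as defined by INSEE": {"any type of unit defined by INSEE"},
}


def subclass_tree(root):
    """Return root plus every class below it via 'subclass of'."""
    found = {root}
    stack = [root]
    while stack:
        current = stack.pop()
        for child, parents in SUBCLASS_OF.items():
            if current in parents and child not in found:
                found.add(child)
                stack.append(child)
    return found


def classes_tagged(root, meta_class):
    """Keep only the classes of the tree that are instances of meta_class."""
    return {
        cls
        for cls in subclass_tree(root)
        if meta_class in INSTANCE_OF.get(cls, set())
    }


insee_classes = classes_tagged(
    "statistical geographical unit", "any type of unit defined by INSEE"
)
print(sorted(insee_classes))
```

Only one tree is traversed here; the second tree is replaced by a direct membership test on each class.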
This kind of pattern also occurs, for example, in biological taxonomy. They use classes such as "genus", "taxon", "clade". IMO any instance of taxon can be seen as a class of living organisms. Then <Taxon> is a class of classes, just as <any type of unit defined by INSEE> can be seen as a class of classes. <Clade> is a subclass of taxon because any clade is a taxon. So clade is also a class of classes.
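As an analogy only, Python's own metaclasses show the same shape: a class whose instances are classes. The sketch below is not a model of Wikidata or of any taxonomy database; the Python names `Taxon`, `Clade`, `Mammalia` and `some_mammal` are chosen for this illustration.

```python
class Taxon(type):
    """A class of classes: each instance is itself a class of organisms."""


class Clade(Taxon):
    """Every clade is a taxon, so Clade is a subclass of Taxon."""


# Mammalia is an instance of Clade and, at the same time, a class of organisms.
Mammalia = Clade("Mammalia", (), {})

# An individual organism is an instance of Mammalia, not of Taxon.
some_mammal = Mammalia()

print(isinstance(Mammalia, Taxon))        # Mammalia is a taxon
print(issubclass(Clade, Taxon))           # any clade is a taxon
print(isinstance(some_mammal, Mammalia))  # the organism belongs to Mammalia
print(isinstance(some_mammal, Taxon))     # but an organism is not a taxon
```

The last check is the important one: ''instance of'' does not chain, so the two levels stay distinct.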
This means the pattern is actually already used, and studied, in the semantic web world. As it's simple and can be found in a lot of domains, I think we should use it in Wikidata.
2014-06-11 11:51 GMT+02:00 Gerard Meijssen gerard.meijssen@gmail.com:
Hoi,
You lost me. What have classes to do with this?
The system is flexible and you assume that there are special cases... Name one and it can be fit in. Thanks, GerardM
On 11 June 2014 11:48, Thomas Douillard thomas.douillard@gmail.com wrote:
There are always special cases that need special treatment. Inference is done again and again; there is no point in recomputing everything with an algorithm or a query with hard-coded exceptions when we have a simple and regular system that can handle the exceptions and put them in the data.
Class of classes is such a system. I'll quote the Python programming language motto here, "explicit is better than implicit", which is not really different from "avoiding redundancy at all costs is not always a good thing".
The same reason we have classes in the first place can also apply to classes of classes. Some class memberships can be inferred by a query; I don't think it's a bad idea to state the membership explicitly even in those cases.
2014-06-11 11:21 GMT+02:00 Gerard Meijssen gerard.meijssen@gmail.com:
Hoi,
A bot run by Amir has remedied many of the issues that resulted from an import of data for the United States. The fix was to point only one level up and not have a reference to the state from every location. It is implicitly there. In the final analysis we do not need to know what country something is in, as it can be inferred.
This system assumes that we build the upper layers as is relevant to a specific country. So yes, it is usable for any country and any type of administrative or territorial entity, including how, for instance, the Roman Catholic Church does its thing. Thanks, Gerard
On 11 June 2014 11:08, Thomas Douillard thomas.douillard@gmail.com wrote:
Hi Gerard, I don't understand: as needed for what? In your example it is enough to retrieve all the territorial entities a location is in.
But let's say I want to get the administrative territorial organisation of France (for the Wikipedias, probably), I mean something like "France is divided into regions, regions are divided into departments, and so on". Do we have enough in your model?
I propose to add to classes like <French Region> an "instance of" claim stating <French Region> instance of <French administrative division type>, to reflect that in Wikidata.
Then if I want to know how France is administratively divided, I query all the instances of that class.
This is a complement to <Pays de la Loire> instance of <French region> for example.
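A minimal sketch of the two queries this gives, written in Python over a toy list of ''instance of'' statements. The data structure and the function name are assumptions for illustration, and <French department> is a hypothetical item label standing in for the departments mentioned above.

```python
# Toy "instance of" statements as (item, class) pairs.
STATEMENTS = [
    # Unit level: concrete territories and their type.
    ("Pays de la Loire", "French region"),
    ("Lorraine", "French region"),
    # Type level: the extra claim proposed in this message.
    ("French region", "French administrative division type"),
    ("French department", "French administrative division type"),  # hypothetical label
]


def instances_of(cls):
    """Return the items directly stated to be instances of cls, sorted."""
    return sorted(item for item, stated in STATEMENTS if stated == cls)


# How is France administratively divided? Ask for the division types.
division_types = instances_of("French administrative division type")
print(division_types)

# Which territories have a given type? Ask one level down.
regions = instances_of("French region")
print(regions)
```

Both questions use the same one-step query; only the class being asked about changes.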
2014-06-11 10:37 GMT+02:00 Gerard Meijssen gerard.meijssen@gmail.com:
Hoi,
Important to recognise is that there can be as many layers as are needed, i.e. a roller coaster can be in a park, a park can be in a settlement, a settlement in a municipality, a municipality in a county, a county in a province, a province in a state and finally a state in a country (that is on a continent)...
This is how it effectively is already in Wikidata for many "locations" Thanks, Gerard
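The layering above can be sketched as a chain of "one level up" links, from which every higher layer is inferred by walking upward. This is a toy illustration in Python; the dictionary, the generic labels and the function name are assumptions and do not reflect how Wikidata stores locations.

```python
# Each entry points exactly one level up, following the example chain above.
LOCATED_IN = {
    "roller coaster": "park",
    "park": "settlement",
    "settlement": "municipality",
    "municipality": "county",
    "county": "province",
    "province": "state",
    "state": "country",
    "country": "continent",
}


def containing_entities(item):
    """Walk upward one level at a time and list every containing entity."""
    chain = []
    seen = {item}
    while item in LOCATED_IN:
        item = LOCATED_IN[item]
        if item in seen:  # guard against accidental loops in the data
            break
        seen.add(item)
        chain.append(item)
    return chain


print(containing_entities("roller coaster"))
print("country" in containing_entities("park"))
```

The country is never stated on the roller coaster itself; it only appears because the chain is followed to the top.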
On 11 June 2014 09:48, Thomas Douillard thomas.douillard@gmail.com wrote:
Hi, I basically proposed a two-layer model in extended discussions:

Administrative units | Administrative unit type | Administrative unit classes by country
City Of London | City of the UK | Type of administrative unit of the UK
Lorraine | French Region | Type of administrative unit of France
Here each cell is an ''instance of'' the cell to its right. This seems close to your ''helper item'' model.