Hi,
Does anybody know where the precision of the globe coordinate datatype is currently documented? This precision was introduced after the original data model discussions.
I used to believe that it was a rough, informal indication of precision, based on an (easy-to-process but necessarily rather inaccurate) bounding box. After all, the UI only allows a small number of predefined settings.
However, the data contains many values with precision value "null". What is this supposed to mean, and how should consumers treat it? The precision of a measurement cannot be "0" (and in any case, 0 is not one of the constants supported in the UI).
In addition, the data also contains some seemingly arbitrary numbers as precision, such as a precision of 10.372851422071 degrees for the location of Oceania (Q538). Given that the Earth is not flat and Oceania is not a square, I fail to see the utility of such an extremely detailed custom precision value. My point is that the area (bounding box) that any such tolerance describes on Earth depends on the location of the center point, and hardly captures our uncertainty in any exact sense. So why bother with such custom values?
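To put rough numbers on that latitude dependence: a tolerance expressed in degrees covers very different east-west distances depending on where the center point lies. A quick back-of-the-envelope sketch (simple sphere model, nothing Wikidata-specific):

    import math

    EARTH_MEAN_RADIUS_M = 6371000  # simple sphere model, not a proper ellipsoid

    def metres_per_degree_of_longitude(lat_deg):
        """Ground distance covered by one degree of longitude at a given latitude."""
        return math.radians(1) * EARTH_MEAN_RADIUS_M * math.cos(math.radians(lat_deg))

    for lat in (0, 30, 60):
        print(lat, round(metres_per_degree_of_longitude(lat) / 1000), "km")
    # 0 -> ~111 km, 30 -> ~96 km, 60 -> ~56 km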
At first I thought this was an error in the data, but the UI has special handling for it (quite literally: it shows the word "special" with the precision in parentheses). However, I cannot edit this value as a user without resetting it to one of the predefined values. This seems a huge limitation, which would require a very good reason for supporting such odd values at all. Or is the behaviour of the UI just a way of recovering from an error while avoiding simply fixing it itself? Should we have a bot that corrects such cases to the nearest existing precision setting?
Cheers,
Markus
Hi Markus,
Markus Krötzsch wrote on 11-1-2015 at 2:15:
Hi,
Does anybody know where the precision of the globe coordinate datatype is currently documented? This precision was introduced after the original data model discussions.
No clue, but I do know we have to do some conversions. See https://git.wikimedia.org/blob/pywikibot%2Fcore.git/HEAD/pywikibot%2F__init__.py#L290 for the relevant Pywikibot code. Do the reverse on seemingly odd values and you will probably end up with a nice dimension. Dimension is documented at https://www.mediawiki.org/wiki/Extension:GeoData#Glossary
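For what it is worth, the conversion in that code appears to go from an object "dimension" in metres to a precision in degrees, roughly along these lines (a sketch only, assuming Earth and the WGS84 equatorial radius; other globes would need a different radius):

    import math

    WGS84_EQUATORIAL_RADIUS_M = 6378137  # assuming Earth; other globes differ

    def dim_to_precision(dim_m, lat_deg):
        """Approximate coordinate precision (in degrees) for an object of the
        given size (in metres) at the given latitude. The same distance spans
        more degrees of longitude the further the point is from the equator,
        hence the cos(latitude) factor."""
        return math.degrees(
            dim_m / (WGS84_EQUATORIAL_RADIUS_M * math.cos(math.radians(lat_deg))))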
Maarten
On 11.01.2015 14:53, Maarten Dammers wrote:
No clue, but I do know we have to do some conversions. See https://git.wikimedia.org/blob/pywikibot%2Fcore.git/HEAD/pywikibot%2F__init__.py#L290 for the relevant Pywikibot code.
Aha, so Pywikibot converts from "approximate size of the object" to "approximate precision of the coordinates" (the latter must take into account how far north the point is). Are you saying that the seemingly odd precision values in Wikidata have been created in an attempt to draw a tight bounding box around an object of a given approximate size? Do you think this encoding of approximate size is a good way of handling this information?
Do the reverse on seemingly odd values and you probably end up with a nice dimension. Dimension is documented at https://www.mediawiki.org/wiki/Extension:GeoData#Glossary
Yes, this operation could surely be reversed without loss of precision if one knows the "radius" of any body for which we have coordinates. However, I am not sure that this "size of the object" interpretation of precision is what Wikidata wants to say here in the first place. At least in the UI it looks more like a kind of "precision of measurement", or (worst case) some mixture of both.
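As an illustration, the reverse direction would be something like the following (same assumed formula and radius as the sketch above; that Q538's stored coordinate sits at roughly 30 degrees latitude is only my guess from the numbers):

    import math

    WGS84_EQUATORIAL_RADIUS_M = 6378137  # metres, assuming Earth

    def precision_to_dim(precision_deg, lat_deg):
        """Recover the approximate object size (in metres) from a precision in degrees."""
        return (math.radians(precision_deg)
                * WGS84_EQUATORIAL_RADIUS_M
                * math.cos(math.radians(lat_deg)))

    # At 30 degrees latitude (north or south), the odd-looking Oceania precision
    # corresponds to a "nice dimension" of about 1000 km:
    print(round(precision_to_dim(10.372851422071, 30)))  # ~1000000 metres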
For Wikidata Toolkit, the big question is whether we should continue to try and convert the data to something that matches what the UI supports, or whether we should give up and say "precision is just any number -- make of it what you want".
Markus
Getting precision right would be invaluable, Markus, especially for eventual STEM research, and across languages.
Scott
Summing up the answers I got via wikidata-tech:
* Arbitrary precision values are supported deliberately. The motivation is to be able to capture accurately the precision given in external sources. I think this means that at some point it should also become possible to edit such values through the UI.
* Null values for precision are illegal, as are negative and zero values. The current (relatively recent) code will default to 1/3600 degree (one arcsecond) in such cases.
In response to these insights, Wikidata Toolkit will be changed to support arbitrary precisions, to apply the same default for invalid values, and to switch all of its coordinate data to floating point numbers (it currently uses longs as fixed-precision decimals).
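On the consumer side, the fallback could be as simple as the following sketch (normalize_precision is a hypothetical helper; the dict layout assumed here is that of globecoordinate values in the JSON dumps):

    DEFAULT_PRECISION = 1.0 / 3600  # one arcsecond, per the answer above

    def normalize_precision(globecoordinate_value):
        """Return a usable precision for a globe coordinate value, falling back
        to one arcsecond when the stored precision is null, zero or negative."""
        precision = globecoordinate_value.get('precision')
        if precision is None or precision <= 0:
            return DEFAULT_PRECISION
        return float(precision)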
Cheers,
Markus
For places where precision is required, has anyone given thought to using Geohash keys rather than lat/lon?
The benefit of geohashing here is that you get both the location and the precision in a single value.
As a secondary benefit, it can be indexed in a database more easily than a standard lat/lon coordinate pair.
- Serge
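To make the geohash idea concrete, here is a minimal, dependency-free sketch of an encoder; the precision is carried entirely by the chosen string length, and truncating the string simply widens the cell again:

    BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"  # standard geohash alphabet

    def geohash(lat, lon, length=9):
        """Encode a latitude/longitude pair as a geohash string of the given length."""
        lat_range, lon_range = [-90.0, 90.0], [-180.0, 180.0]
        bits = []
        use_lon = True  # geohash interleaves bits, starting with longitude
        while len(bits) < length * 5:
            rng, val = (lon_range, lon) if use_lon else (lat_range, lat)
            mid = (rng[0] + rng[1]) / 2
            if val >= mid:
                bits.append(1)
                rng[0] = mid
            else:
                bits.append(0)
                rng[1] = mid
            use_lon = not use_lon
        return "".join(BASE32[int("".join(map(str, chunk)), 2)]
                       for chunk in (bits[i:i + 5] for i in range(0, len(bits), 5)))

    print(geohash(57.64911, 10.40744, 11))  # u4pruydqqvj, the usual textbook example

One caveat: geohash cells are not square and their east-west extent shrinks towards the poles, so the precision encoded this way has the same latitude dependence discussed earlier in the thread.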
The big picture is that it is important to recognize the role of geodetic datums.
The market-leading datum is WGS84, the GPS datum, because (1) it is good enough for military work and (2) GPS hardware is everywhere.
Before the space age, geodetic datums were determined by optical observations forming a network across a region. Islands far from the coast would get their own datum because the position of the island itself relative to the mainland is uncertain.
In today's globalized world we still have reasons for multiple datums.
I often see road crews using specialized GPS equipment with large antennas. These systems establish a local datum around one or more base stations, which in the most advanced setups can be precise to the centimetre range.
Measurements made with that kind of system will NOT be precisely comparable with measurements made in other places, but you could dumb the claim down to "WGS84", because the measurement would still be at least that good.
Then there is quality in the sense of conformance to requirements; we might muck with some coordinates to make data useful.
For instance, if you use a good handheld GPS to survey the stars on the Hollywood Walk of Fame and then plot the points on Google Maps, you discover that two MUST requirements are violated:
(i) the stars should all appear in the correct order, and (ii) on the correct side of the street,
which from the viewpoint of a pedestrian is a lot more important than the fact that my images of Hollywood and Vine are rotated a bit relative to the Big G's.
Thus, an augmented reality mobile app for the Walk of Fame would require its own geodetic datum.
I spend more time walking in the woods than I do in L.A., and in the woods there are similar but different concerns. If you walk a path that closely follows a creek, for instance, a GPS trace may not agree with the actual sequence of creek crossings -- something that would drive me nuts if I were using a map while hiking in the woods. You're supposed to fix topological problems like that when you upload to OpenStreetMap, so that is another sense of a privileged datum.