Stas and All,
Would the coordinate precision in Wikidata and the Query Service now allow us to move to the cellular, and even the atomic and subatomic, levels? For example, when querying for differences between microscopic species in a piece of earth in a municipality that might be rich in the smallest life forms, say differences between nanobes (https://en.wikipedia.org/wiki/Nanobe)? And would Wikidata want even further precision for knowledge generation based on what we might learn from expansion microscopy, e.g. Ed Boyden's work at MIT, say in brain science (https://www.youtube.com/watch?v=bPlr31LrT0g)?
Scott
On Tue, Aug 29, 2017 at 2:13 PM, Stas Malyshev smalyshev@wikimedia.org wrote:
Hi!
I would like to initiate a discussion about coordinate precision in Wikidata and Query Service. The reason is that right now we do not have any limit on precision: coordinates are basically doubles, which allows specifying over-precise coordinates and makes them harder to compare, both among themselves within Wikidata and with outside services.
From the precision description in [1], we would rarely need more than the third or fourth digit after the decimal point. However, we have coordinates in the database like Point(13.366666666 41.766666666), which pretends to specify a location with sub-millimeter accuracy, for an entity that describes a municipality [2]!
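To put rough numbers on this, here is a minimal sketch of how much ground one unit in the last decimal place covers. It assumes a spherical Earth with about 111.32 km per degree of latitude; the constant and the function name ground_resolution_m are illustrative only and not part of Wikibase.

```python
import math

# Assumption: spherical Earth, one degree of latitude spans about 111.32 km.
METERS_PER_DEGREE = 111_320.0


def ground_resolution_m(decimals, latitude_deg=0.0):
    """Return (north_south, east_west) size in meters of one unit
    in the last decimal place of a decimal-degree coordinate."""
    step_deg = 10.0 ** -decimals
    north_south = step_deg * METERS_PER_DEGREE
    # Meridians converge toward the poles, so east-west distance shrinks
    # with the cosine of the latitude.
    east_west = north_south * math.cos(math.radians(latitude_deg))
    return north_south, east_west


if __name__ == "__main__":
    # 41.77 is roughly the latitude of the example point above.
    for d in (3, 4, 9):
        ns, ew = ground_resolution_m(d, latitude_deg=41.77)
        print(f"{d} decimals: ~{ns:.4g} m north-south, ~{ew:.4g} m east-west")
```

Under these assumptions the fourth decimal corresponds to roughly ten meters and the ninth to a fraction of a millimeter.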
We do have precision on values - e.g. the above has a specified precision of "arcseconds" - so it may be just a formatting issue, but even arcsecond looks somewhat over-precise for a city. And it may be a bit challenging to convert DMS precision to DD precision.
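One possible way to do the DMS-to-DD conversion for display is to pick the smallest number of decimal places whose step is no coarser than the stated precision. This is only a sketch: the DMS_PRECISION_DEG table and both function names are hypothetical, not something Wikibase provides.

```python
import math

# Hypothetical lookup of DMS precision names to their size in degrees.
DMS_PRECISION_DEG = {
    "degree": 1.0,
    "arcminute": 1.0 / 60.0,
    "arcsecond": 1.0 / 3600.0,
}


def decimals_for_precision(precision_deg):
    """Smallest number of decimal places whose step (10**-n degrees)
    is no coarser than the given precision in degrees."""
    if precision_deg <= 0:
        raise ValueError("precision must be positive")
    return max(0, math.ceil(-math.log10(precision_deg)))


def format_coordinate(value_deg, precision_deg):
    """Format a decimal-degree value with only as many decimals
    as the stated precision justifies."""
    n = decimals_for_precision(precision_deg)
    return f"{value_deg:.{n}f}"


if __name__ == "__main__":
    arcsecond = DMS_PRECISION_DEG["arcsecond"]
    print(decimals_for_precision(arcsecond))
    print(format_coordinate(13.366666666, arcsecond))
```

With this rule an arcsecond precision maps to four decimals and an arcminute to two, so the stored double stays untouched and only the output is trimmed.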
But the bigger question is whether we should store over-precise coordinates in the database at all, or whether we should round them on export or inside the data. The formulae that are used to calculate distances have, for obvious reasons, limited precision, and direct comparisons can't take precision into account, which may make such coordinates very hard to work with. Should we maybe just put a limit on how precise we put coordinates into RDF and in query service? Would four decimals after the dot be enough? According to [4] this is what a commercial GPS device can provide. If not, why and which accuracy would be appropriate?
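As an illustration of the rounding option, here is a small sketch that rounds a (lat, lon) pair to four decimals and compares two pairs with a matching tolerance instead of direct equality. The helper names round_coordinate and same_place are made up for this example, and the four-decimal default is just the value proposed above, not an agreed limit.

```python
def round_coordinate(lat, lon, decimals=4):
    """Round a (lat, lon) pair in decimal degrees to a fixed number of decimals."""
    return round(lat, decimals), round(lon, decimals)


def same_place(a, b, decimals=4):
    """Treat two (lat, lon) pairs as the same place when each component
    differs by no more than one step of the chosen decimal precision."""
    tolerance = 10.0 ** -decimals
    return abs(a[0] - b[0]) <= tolerance and abs(a[1] - b[1]) <= tolerance


if __name__ == "__main__":
    # WKT Point(13.366666666 41.766666666) is longitude first, latitude second.
    stored = (41.766666666, 13.366666666)
    print(round_coordinate(*stored))
    print(same_place(stored, (41.7667, 13.3667)))
```

A tolerance comparison is used rather than comparing rounded values for equality, because two nearby coordinates can fall on opposite sides of a rounding boundary.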
We do export precision of the coordinate as wikibase:geoPrecision[3] - and we currently have 258060 distinct values for it. This is very weird. I am not sure precision is useful in this form. Can anybody tell me any use case for this number now? If not, maybe we should change how we represent it. I'm also not sure where these come from as we only have 13 options in the UI. Bots?
[1] https://en.wikipedia.org/wiki/Decimal_degrees
[2] https://www.wikidata.org/wiki/Q116746
[3] https://www.mediawiki.org/wiki/Wikibase/Indexing/RDF_Dump_Format#Globe_coordinate
[4] https://gis.stackexchange.com/questions/8650/measuring-accuracy-of-latitude-and-longitude
-- Stas Malyshev smalyshev@wikimedia.org
Wikidata mailing list Wikidata@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikidata