Hi,
Here is the technical counterpart of the question on globe coordinates I just sent to wikidata-l:
""" Which of the following statements are most accurate given the technical roadmap of Wikibase?
(1a) Wikibase will continue to support arbitrary precision values for coordinates, and the UI will be extended so people can actually enter them. (1b) Wikibase will restrict the set of supported precision values for coordinates to those already supported in the UI. Other values are considered an error that will have to be fixed in the future.
(2a) Null values for precision are an error that should be fixed in the data. Wikibase will reject such data in the future. (2b) Null values for precision have a meaning. It is as follows (please explain): ... """
It would really be useful for third parties to know which way this is going.
Thanks,
Markus
Hi!
Which of the following statements are most accurate given the technical roadmap of Wikibase?
(1a) Wikibase will continue to support arbitrary precision values for coordinates, and the UI will be extended so people can actually enter them. (1b) Wikibase will restrict the set of supported precision values for coordinates to those already supported in the UI. Other values are considered an error that will have to be fixed in the future.
I'm not in the know about the technical roadmap, but I have a question - why would arbitrary precision be needed? I.e., 0.01''[1] is a centimeter-scale resolution. I'm not sure what coordinates can even be known with such resolution, let alone be needed for anything practical.
[1] https://en.wikipedia.org/wiki/Wikipedia:WikiProject_Geographical_coordinates...
Thanks, -- Stas Malyshev smalyshev@wikimedia.org
Hi Stas,
On 11.01.2015 23:15, Stas Malyshev wrote: ...
I'm not in the know about the technical roadmap, but I have a question - why would arbitrary precision be needed? I.e., 0.01''[1] is a centimeter-scale resolution. I'm not sure what coordinates can even be known with such resolution, let alone be needed for anything practical.
I agree. One could probably remove some of the ultra-precise options for precision without losing anything realistic (there is probably an intrinsic limit to the precision in coordinate systems like WGS84).
Regards,
Markus
Anybody? If the answer is "we have not thought about this yet" then it would be good to know this, too. Any considerations that have led to the current implementation are of interest.
Cheers,
Markus
On 11.01.2015 17:11, Markus Krötzsch wrote: ...
Wikidata-tech mailing list Wikidata-tech@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikidata-tech
On 12.01.2015 14:48, Markus Krötzsch wrote:
Anybody? If the answer is "we have not thought about this yet" then it would be good to know this, too. Any considerations that have led to the current implementation are of interest.
The range is limited (no extremely large or extremely small precisions), but inside that range, you can pick any number. The reason is that we need to be able to support precisions that are given in the actual sources; some sources give them as fractions of a degree, but some give them in meters or kilometers, which then must be converted to degrees depending on the location, giving very "odd" numbers.
As for the range, I think the least precise we support is 1 degree, and the most precise is 10^-8 degrees, but I can very well be wrong there. I think we agreed at some point that anything beyond a meter is not useful.
I think it was Maarten Dammers who brought up the need for supporting arbitrary precision values, not just a fixed set.
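To illustrate why a precision given in meters ends up as an "odd" number of degrees, here is a minimal sketch of such a conversion. It assumes a spherical Earth with the WGS84 equatorial radius; the function name `meters_to_degrees` is hypothetical and is not taken from Wikibase.

```python
import math

# Assumption: spherical Earth using the WGS84 equatorial radius, in meters.
EARTH_RADIUS_M = 6378137.0


def meters_to_degrees(precision_m, latitude_deg):
    """Convert a precision in meters to (latitude, longitude) precisions in degrees.

    Hypothetical helper for illustration only. It is not usable at the
    poles, where a degree of longitude has no meaningful length.
    """
    meters_per_degree = math.pi * EARTH_RADIUS_M / 180.0
    lat_precision = precision_m / meters_per_degree
    # A degree of longitude shrinks with the cosine of the latitude.
    lon_precision = precision_m / (
        meters_per_degree * math.cos(math.radians(latitude_deg))
    )
    return lat_precision, lon_precision


# A source stating "accurate to 100 m" for a place at 60 degrees north.
lat_p, lon_p = meters_to_degrees(100.0, 60.0)
print(lat_p, lon_p)
```

At 60 degrees north the longitude precision comes out at roughly twice the latitude precision, and neither is a round fraction of a degree, which is why a fixed list of allowed values does not fit such sources.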
On 12.01.2015 15:59, Daniel Kinzler wrote:
On 12.01.2015 14:48, Markus Krötzsch wrote:
Anybody? If the answer is "we have not thought about this yet" then it would be good to know this, too. Any considerations that have led to the current implementation are of interest.
The range is limited (no extremely large or extremely small precisions), but inside that range, you can pick any number. The reason is that we need to be able to support precisions that are given in the actual sources; some sources give them as fractions of a degree, but some give them in meters or kilometers, which then must be converted to degrees depending on the location, giving very "odd" numbers.
As for the range, I think the least precise we support is 1 degree, and the most precise is 10^-8 degrees, but I can very well be wrong there. I think we agreed at some point that anything beyond a meter is not useful.
I think it was Maarten Dammers who brought up the need for supporting arbitrary precision values, not just a fixed set.
Great, this clarifies a lot for me. The other question was what to make of null values for precision. Do they mean "no precision known" or something else?
Markus
On 12.01.2015 15:09, Markus Krötzsch wrote:
Great, this clarifies a lot for me. The other question was what to make of null values for precision. Do they mean "no precision known" or something else?
IIRC, "null" is a bug here. Not sure how to handle that - we don't have the original string, and we can't really guess the precision based on the float values.
Looking at GeoCoordinateFormatter, I see this:
if ( $precision <= 0 ) { $precision = 1 / 3600; }
I.e. it assumes 1 arcsecond if no precision is given. Not great, but not much else we can do at this point.
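For third-party consumers of the data, the same fallback can be sketched outside PHP as follows. This is only an illustration of the rule quoted above, not the Wikibase code itself; the names `DEFAULT_PRECISION` and `effective_precision` are hypothetical. The explicit `None` check is needed because, unlike PHP, Python does not treat a null value as less than or equal to zero.

```python
# One arcsecond, expressed in degrees.
DEFAULT_PRECISION = 1 / 3600


def effective_precision(precision):
    """Return the precision to use, falling back to one arcsecond.

    Null (None) and non-positive values are both treated as "no
    precision given", mirroring the quoted formatter check.
    """
    if precision is None or precision <= 0:
        return DEFAULT_PRECISION
    return precision


print(effective_precision(None))   # falls back to one arcsecond
print(effective_precision(0.01))   # a valid precision is kept as-is
```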
Hey,
Looking at GeoCoordinateFormatter, I see this:
if ( $precision <= 0 ) { $precision = 1 / 3600; }
As additional context: this was added only last November.
Cheers
-- Jeroen De Dauw - http://www.bn2vs.com Software craftsmanship advocate Evil software architect at Wikimedia Germany ~=[,,_,,]:3
Thanks for the info. Then we will also treat null and "0" as errors and assume arcsecond precision for these cases. We will not log warnings, however, since there are thousands of such cases in the dump, which would swamp the log :-(. It would be good to have a bot fix this at some point.
Cheers,
Markus
P.S. I think I will also give up my opposition to double as a data format, and change Wikidata Toolkit to use double for all coordinates as well. With the elimination of fixed precisions, there is no longer any requirement to compare the data using "==" (which is just wrong for doubles).
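The point about "==" can be illustrated with a short sketch. The tolerance rule used here (half the stated precision) is an assumption made for the example; the thread does not specify how coordinates should be compared, and `same_coordinate` is a hypothetical name.

```python
def same_coordinate(lat1, lon1, lat2, lon2, precision):
    """Treat two coordinates as equal if they differ by at most half the precision.

    Hypothetical comparison rule for illustration; all arguments are in degrees.
    """
    tolerance = precision / 2
    return abs(lat1 - lat2) <= tolerance and abs(lon1 - lon2) <= tolerance


# Two values that are the same number on paper differ in floating point.
a = 0.1 + 0.2
b = 0.3
print(a == b)                                      # exact comparison
print(same_coordinate(a, 0.0, b, 0.0, 1 / 3600))   # tolerance-based comparison
```

The exact comparison fails because of binary rounding, while the tolerance-based one succeeds for any realistic precision.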
On 12.01.2015 17:52, Jeroen De Dauw wrote: ...
Hi Markus,
(1a) Wikibase will continue to support arbitrary precision values for
coordinates, and the UI will be extended so people can actually enter them.
(1b) Wikibase will restrict the set of supported precision values for
coordinates to those already supported in the UI. Other values are considered an error that will have to be fixed in the future.
In my opinion, possibly neither, with a tendency towards (1a). Currently the API accepts any number (which makes sense in my opinion: how should the API provide a set of allowed precisions, and why and how should it reject certain numbers?). The UI supports auto-detection and a selection of predefined precisions, which is much easier to use. There may be an option to enter the precision as a number, if requested, but I don't think this is necessary at this point.
I recently introduced limits of 0.00000001° (8 decimal places) and 00°00'00.01" to the precision auto-detection to work around IEEE rounding issues (which happen both internally and externally). Both limits are equivalent to approximately 1 mm which "should be enough for anybody(tm)".
There are not really hard limits when using the API. What is entered is stored, which is how it should be in my opinion.
There is a hard limit of 1 in the formatters. Precisions bigger than 1 are ignored and default to 1.
Rounding errors and IEEE issues in the precision do not matter. The formatters calculate the number of significant decimal places from the precision (which is basically a type of rounding to a fraction of a degree, minute or second that is smaller than the precision, depending on the output format). When parsing this formatted string the internal IEEE representation may change, but this possible "loss" is a one-time thing, does not accumulate and is irrelevant for the displayed string and equality checks (if they are done right).
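As a rough illustration of deriving the number of decimal places from a precision, here is a minimal sketch for decimal-degree output only. It is an assumption about how such a calculation could look, not the actual Wikibase formatter; `decimal_places` and `format_decimal_degrees` are hypothetical names, and the precision is assumed to be positive.

```python
import math


def decimal_places(precision):
    """Number of decimal places needed to express the given precision in degrees.

    Precisions bigger than 1 are clamped to 1, following the hard limit
    described above. The precision must be greater than zero.
    """
    precision = min(precision, 1.0)
    # The small offset guards against log10 landing just above an integer.
    return max(0, math.ceil(-math.log10(precision) - 1e-9))


def format_decimal_degrees(value, precision):
    """Format a coordinate value with just enough decimal places."""
    return f"{value:.{decimal_places(precision)}f}"


print(format_decimal_degrees(52.516667, 0.01))
print(format_decimal_degrees(52.516667, 1 / 3600))
```

Because the output is rounded to a whole number of decimal places, small IEEE errors in the stored precision do not change the displayed string.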
(2a) Null values for precision are an error that should be fixed in the
data. Wikibase will reject such data in the future.
(2b) Null values for precision have a meaning. It is as follows (please
explain): ...
We currently have null values in the database. I tend to think of them as "not yet entered". I'm not sure if we should "reject" this at any point; I prefer to apply the auto-detection instead (so the answer is, again, neither).
this was added only last November.
There always was a fall back to 1/3600° if no precision was given, but that code was incomplete. If a coordinate with no precision made it to the database you could not see, edit and fix it. This is possible now. Instead of applying the auto-detection in the formatter (which would be possible but may be confusing and inconsistent) the output defaults to the most common DD°MM'SS" (a.k.a. 1/3600°).
There are quite a lot of edge cases. I already fixed a lot of them (and added tests to make sure they never break) and will happily add and fix more. Just tell me if you find one.
Best
Hi Thiemo,
Thanks for the background information. I agree with these default choices -- seems useful. Just two comments:
There may be an option to enter the precision as a number, if requested, but I don't think this is necessary at this point.
I think the point is simply that it is not nice to have a wiki system where some (useful) edits cannot be made by normal people but only by developers.
...
We currently have null values in the database. I tend to think of them as "not yet entered". I'm not sure if we should "reject" this at any point, I prefer to apply the auto-detection instead (so the answer is, again, neither nor).
The problem is that auto-detection cannot be applied to the JSON data, because the values there are double numbers. Auto-detection would require the user input (decimal number string) to work. Therefore, we now default to 1.0/3600 when seeing "null" precisions in the data.
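A small sketch shows why the original input string is needed. It assumes a simplified auto-detection that looks only at the number of decimal places typed for a decimal-degree value; the real Wikibase auto-detection is not shown in this thread, and `detect_precision` is a hypothetical name.

```python
def detect_precision(text):
    """Guess a precision in degrees from the decimal places in the typed string.

    Hypothetical and simplified: handles plain decimal-degree input only.
    """
    _, _, fraction = text.strip().partition(".")
    return 10.0 ** -len(fraction)


# The strings carry different precision information...
print(detect_precision("52.5"))
print(detect_precision("52.500"))

# ...but once parsed to doubles, as in the JSON data, they are identical.
print(float("52.5") == float("52.500"))
```

Once only the double is available, trailing zeros are gone and the typed precision cannot be recovered, hence the fixed default.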
Cheers,
Markus