In addition to a storage option for the desired unit prefix (this may be considered an original prefix, since re-users may naturally wish to reformat it),
I see no point in storing the unit used for input.
I think you plan to store the unit (which would be meter), so you don't want to store prefixes, correct?
Please argue why you don't see a point. Do you want to store the size of the universe, the distance to New York and the size of the proton all in "meter"? If so, with which algorithm will you restore the SI prefix, or rather, recognize which SI prefix is usable? We do not use Mm in common language, so we give the circumference of the earth as roughly 40 000 km and not as 40 Mm. We don't write 4*10^7 m either.
it is probably necessary to store the number of significant decimals.
That's how Denny proposed to calculate the default accuracy. If the accuracy is given by a complex model (e.g. a gamma distribution), then it might be handy to have a simple value that tells us the significant digits.
Hm... perhaps it's best to always express accuracy as "+/-n", and allow for more detailed information (standard deviation, whatever) as *additional* information about the accuracy (could be modelled as a qualifier internally).
I fear that mixes two separate levels of stating measurement _precision_ (I believe "accuracy" is the wrong term here; precision and accuracy are related but distinct concepts). So 4.10 means that the last digit is significant, i.e. the best estimate is at least between 4.095 and 4.105 (but it may be better). 4.10 +/- 0.005 means it is precisely between 4.095 and 4.105, as opposed to 4.10 +/- 0.004, 4.10 +/- 0.003, 4.10 +/- 0.002 etc.
Furthermore, a quantity may be given as 4.10-4.20-4.35. The precision of measurement and the measures of variance and dispersion are separate concepts.
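To make the "last digit is significant" reading concrete, here is a minimal sketch in Python. It assumes the value is kept as the string the author typed. The function name `implied_interval` is made up for illustration and is not part of any Wikidata proposal.

```python
from decimal import Decimal

def implied_interval(text):
    """Return (low, high) implied by the digits written in `text`.

    Assumption: the last written digit is significant, so the value
    lies within half a unit of that last decimal place.
    """
    value = Decimal(text)
    exponent = value.as_tuple().exponent          # -2 for "4.10"
    half_step = Decimal(5).scaleb(exponent - 1)   # 0.005 for "4.10"
    return value - half_step, value + half_step

low, high = implied_interval("4.10")
print(low, high)  # 4.095 4.105
```

Note that "4.1" and "4.10" give different intervals here, which is exactly the information lost once the trailing zero is dropped. An explicit "+/- n" would be stored separately and override this implied default.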
I believe this need not be a visible setting in the user interface; the number of digits can simply be preserved. Without these it is impossible to store and reproduce information like "10.20 nm"; it would be returned as 1.02*10^-8 m.
No, it would return using whatever system of measurement the user has selected in their preferences.
Then you have lost the information. There is no "user selection" for this in science.
A complex heuristic may "guess" when to use the scientific SI prefixes instead. The trailing zero, however, cannot be reproduced when relying completely on IEEE floating-point.
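The trailing-zero problem is easy to demonstrate. The following is a small Python sketch, not a proposal for the actual storage format. It compares a binary float with `decimal.Decimal` and shows that a float plus a separately stored digit count can reproduce the original text. The helper `significant_decimals` is a hypothetical name.

```python
from decimal import Decimal

as_float = float("10.20")      # IEEE 754 double: the trailing zero is gone
as_decimal = Decimal("10.20")  # decimal type: the written digits are kept

print(str(as_float))    # 10.2
print(str(as_decimal))  # 10.20

def significant_decimals(text):
    """Number of digits written after the decimal point in `text`."""
    return max(0, -Decimal(text).as_tuple().exponent)

# Storing the digit count next to the float is enough to restore the text.
digits = significant_decimals("10.20")
print(format(as_float, f".{digits}f"))  # 10.20
```

Either approach works for this purpose: a decimal type, or a float with an extra "number of decimals" field.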
We'll need heuristics to pick the correct secondary unit (e.g. nm or km). The
(I believe there is no such thing as a "secondary unit"; did you make that term up? Only "m" is a unit of measurement, the n or k are prefixes; see http://en.wikipedia.org/wiki/SI_prefix )
general rule could be to pick a unit so that the actual value is between 1 and 10, with some additional rules for dealing with cultural specialities (decimeter is rarely used, hectoliter however is pretty common. The decagram is commonly used in Austria only, etc).
You would also need to know which prefix is applicable to which unit in which context. In a scientific context different prefixes are used than in a lay context. In a lay context astronomical temperatures may be given in degrees Celsius, in a scientific context in kelvin. This is not just a user preference.
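To illustrate why the heuristic depends on context, here is a rough Python sketch. The prefix table is standard SI, but the two "allowed" lists and the function `pick_prefix` are assumptions made up for this example. The rule is also simplified: it picks the largest allowed prefix that keeps the value at or above 1, and does not enforce an upper bound such as 10.

```python
# Decimal exponents of a few SI prefixes ("" means no prefix).
SI_PREFIXES = {"n": -9, "µ": -6, "m": -3, "": 0, "k": 3, "M": 6, "G": 9}

# Hypothetical per-context lists for lengths in metres.
LAY_LENGTH = ["m", "", "k"]
SCIENTIFIC_LENGTH = list(SI_PREFIXES)

def pick_prefix(value_in_base_unit, allowed):
    """Pick the largest allowed prefix that keeps the scaled value >= 1."""
    candidates = sorted(allowed, key=lambda p: SI_PREFIXES[p], reverse=True)
    for prefix in candidates:
        scaled = value_in_base_unit / 10 ** SI_PREFIXES[prefix]
        if abs(scaled) >= 1:
            return prefix, scaled
    # Value is smaller than every allowed prefix: use the smallest one.
    smallest = candidates[-1]
    return smallest, value_in_base_unit / 10 ** SI_PREFIXES[smallest]

circumference = 4e7  # roughly the circumference of the earth, in metres
print(pick_prefix(circumference, LAY_LENGTH))         # ('k', 40000.0)
print(pick_prefix(circumference, SCIENTIFIC_LENGTH))  # ('M', 40.0)
```

The same stored value comes out as 40 000 km or as 40 Mm depending on a list that somebody has to maintain per unit and per context. Storing the prefix the author actually used avoids having to guess.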
I agree that the system should allow explicit conversion in infoboxes. I disagree that you should create an artificial intelligence system for Wikidata that knows more about unit usage than the authors. To store the wisdom of authors, storing both unit and original unit prefix is necessary.
You write "The Precision can be derived from the accuracy and vice versa, using appropriate heuristics."
I _very strongly_ doubt that. Can you give any proof of that? For precision I can use statistics; for accuracy I need an indirect, separate and precise method of estimation. If you have a laser distance measurement device, you can estimate the precision yourself by repeated measurements at various times, temperatures, etc. But unless you have an objective distance standard, you have no means to determine whether the device is always off by 10 cm because someone screwed up the software program inside it.
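The laser example can be put into numbers. This is a sketch with invented readings and an invented reference distance, purely for illustration. The spread of the repeated readings can be computed from the readings alone, but the offset can only be computed once an independent reference is available.

```python
import statistics

# Hypothetical repeated readings of the same distance, in metres.
readings = [10.11, 10.09, 10.10, 10.12, 10.08]

# Precision: estimated from the readings themselves.
precision = statistics.stdev(readings)

# Accuracy: needs an independent standard (hypothetical value here).
reference = 10.00
bias = statistics.mean(readings) - reference

print(round(precision, 3))  # about 0.016 m spread
print(round(bias, 2))       # about 0.1 m systematic offset
```

Without `reference` the second number cannot be derived at all, no matter how many readings are taken, so one value cannot be derived from the other by a heuristic.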
But they are not the same. IMHO, the accuracy should always be stored with the value, the precision never.
I fear that is a view of how data in a perfect world should be known, not a reflection of the kind of data that people need to store in Wikidata. Very often only the precision will be known or available to its authors, or worse, the source may not say which it is.
Gregor