On 19.12.2012 08:53, Gregor Hagedorn wrote:
Displaying the numbers is another question. There I have to agree that it always makes sense to also store a typically used unit for that type of data.
I agree. What I propose is that the user interface supports entering and proofreading "10.6 nm" as "10.6" plus "n" (= nano) plus "meter".
Yes, absolutely.
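A minimal sketch of what such an input split could look like, in Python. The prefix and unit tables and the function name `split_quantity` are hypothetical and cover only a handful of cases; a real interface would need a far more complete unit list.

```python
import re
from decimal import Decimal

# Hypothetical lookup tables, deliberately tiny (assumptions for illustration).
SI_PREFIXES = {"n": -9, "µ": -6, "m": -3, "k": 3}
BASE_UNITS = {"m": "meter", "g": "gram", "l": "liter"}

def split_quantity(text):
    """Split e.g. '10.6 nm' into (Decimal('10.6'), 'n', 'meter')."""
    match = re.fullmatch(r"\s*([0-9]+(?:\.[0-9]+)?)\s*(\S+)\s*", text)
    if match is None:
        raise ValueError(f"cannot parse {text!r}")
    number, unit = match.groups()
    # A bare base unit wins over prefix + unit, so "m" means meter, not milli.
    if unit in BASE_UNITS:
        return Decimal(number), "", BASE_UNITS[unit]
    prefix, base = unit[0], unit[1:]
    if prefix in SI_PREFIXES and base in BASE_UNITS:
        return Decimal(number), prefix, BASE_UNITS[base]
    raise ValueError(f"unknown unit {unit!r}")
```

The number is kept as a `Decimal` built from the typed string, so the digits the user entered are still available for proofreading.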
How the value is stored in the data property, whether as 10.6 floating point or as 1.06e-8 is a second issue -- the latter is probably preferable.
I think neither is sufficient: we need a representation that allows for arbitrary (or at least very great) precision, and can still be indexed and compared natively by (different!) database systems. Fixed-length strings can easily do that, if they are long enough. That's pretty inefficient, though.
IEEE floats work natively, but don't guarantee enough precision (well, maybe 128-bit floats come close?). The SQL "decimal" type might be sufficient: in MySQL, it allows 30 decimal digits before the decimal point, and up to 64 after. But that's still not enough to measure the extent of the universe in Planck lengths.
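To illustrate the fixed-length string idea, here is a rough Python sketch. The widths `INT_DIGITS` and `FRAC_DIGITS` and the function `to_sortable` are assumptions made up for this example, and negative values are left out. The point is only that zero-padded strings of equal length compare in numeric order under plain string comparison, which any database index can do.

```python
from decimal import Decimal

INT_DIGITS = 40   # assumed number of digits before the decimal point
FRAC_DIGITS = 40  # assumed number of digits after the decimal point

def to_sortable(value):
    """Encode a non-negative Decimal as a fixed-length, zero-padded string."""
    if not value.is_finite() or value < 0:
        raise ValueError("this sketch handles finite, non-negative values only")
    _sign, digits, exponent = value.as_tuple()
    text = "".join(str(d) for d in digits)
    if exponent >= 0:
        # Pure integer: append the zeros implied by the exponent.
        int_part, frac_part = text + "0" * exponent, ""
    elif len(text) > -exponent:
        # The decimal point falls inside the digit string.
        int_part, frac_part = text[:exponent], text[exponent:]
    else:
        # Value below 1: pad the fraction with leading zeros.
        int_part, frac_part = "0", text.rjust(-exponent, "0")
    if len(int_part) > INT_DIGITS or len(frac_part) > FRAC_DIGITS:
        raise OverflowError("value does not fit the assumed widths")
    return int_part.rjust(INT_DIGITS, "0") + "." + frac_part.ljust(FRAC_DIGITS, "0")
```

Note that this encoding maps 10.2 and 10.20 to the same string, so the number of significant digits would have to be kept elsewhere.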
In addition to a storage option of the desired unit prefix (this may be considered an original prefix, since naturally re-users may wish to reformat this).
I see no point in storing the unit used for input.
it is probably necessary to store the number of significant decimals.
That's how Denny proposed to calculate the default accuracy. If the accuracy is given by a complex model (e.g. a gamma distribution), then it might be handy to have a simple value that tells us the significant digits.
Hm... perhaps it's best to always express accuracy as "+/-n", and allow for more detailed information (standard deviation, whatever) as *additional* information about the accuracy (could be modelled as a qualifier internally).
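As a sketch of how a default "+/-n" could fall out of the digits as entered: the rule below (half a unit in the last entered decimal place) is an assumption chosen for illustration, not necessarily what Denny proposed, and `default_accuracy` is a made-up name.

```python
from decimal import Decimal

def default_accuracy(text):
    """Return (value, plus_minus) for a number typed as text.

    Assumed rule: the accuracy is half a unit in the last entered place,
    so '10.20' gives +/-0.005 and '10.2' gives +/-0.05.
    """
    value = Decimal(text)
    last_place = value.as_tuple().exponent  # e.g. -2 for '10.20'
    return value, Decimal(5).scaleb(last_place - 1)
```

More detailed information, such as a standard deviation, could then sit next to this simple value rather than replace it.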
I believe in the user interface this need not be a visible setting; the number of digits can simply be preserved. Without this it is impossible to store and reproduce information like "10.20 nm"; it would be returned as 1.02 × 10^-8 m.
No, it would return using whatever system of measurement the user has selected in their preferences.
Complex heuristics may "guess" when to use the scientific SI prefixes instead. The trailing zero cannot be reproduced, however, when relying completely on IEEE floating point.
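The trailing-zero point can be checked quickly in Python, comparing a binary float with the standard library's `decimal.Decimal` (used here only as one example of a type that keeps the entered digits):

```python
from decimal import Decimal

as_float = float("10.20")      # IEEE 754 double
as_decimal = Decimal("10.20")  # keeps the digits as entered

print(as_float)    # prints "10.2"
print(as_decimal)  # prints "10.20"
```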
We'll need heuristics to pick the correct secondary unit (e.g. nm or km). The general rule could be to pick a unit so that the actual value is between 1 and 10, with some additional rules for dealing with cultural specialities (the decimeter is rarely used, whereas the hectoliter is pretty common; the decagram is commonly used only in Austria, etc.).
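A rough Python sketch of such a heuristic. The prefix table, the `RARE` exclusion set and the function `pick_unit` are all hypothetical. The rule used is "largest allowed prefix that keeps the value at 1 or above", which only approximates the 1-to-10 target because the allowed prefixes are sparse.

```python
from decimal import Decimal

# Hypothetical prefix table: (symbol, power of ten), largest first.
PREFIXES = [("k", 3), ("h", 2), ("da", 1), ("", 0),
            ("d", -1), ("c", -2), ("m", -3), ("µ", -6), ("n", -9)]

# Hypothetical cultural exclusions: (prefix, base unit) pairs to avoid.
RARE = {("h", "m"), ("da", "m"), ("d", "m"), ("da", "l"), ("da", "g")}

def pick_unit(value, unit):
    """Scale a Decimal given in the base unit to a readable prefixed unit."""
    if value == 0:
        return value, unit
    candidates = [(s, p) for s, p in PREFIXES if (s, unit) not in RARE]
    for symbol, power in candidates:
        scaled = value.scaleb(-power)
        if abs(scaled) >= 1:
            return scaled, symbol + unit
    # Smaller than the smallest prefix: fall back to that prefix.
    symbol, power = candidates[-1]
    return value.scaleb(-power), symbol + unit
```

With these tables, 1.06e-8 m comes out as 10.6 nm and 250 l as 2.5 hl; a locale-specific `RARE` set would be needed for cases like the Austrian decagram.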
Note that for rendering of values in infoboxes, the desired unit and precision can always be given explicitly.
Note "precision" vs "accuracy" here: the precision controls how many digits are shown, while the accuracy indicates how exact our knowledge is. The precision can be derived from the accuracy and vice versa, using appropriate heuristics. But they are not the same. IMHO, the accuracy should always be stored with the value, the precision never.
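One possible heuristic for deriving the displayed precision from a stored accuracy, sketched in Python. The rule (show digits down to the leading digit of the accuracy) and the name `display` are assumptions for illustration only.

```python
from decimal import Decimal

def display(value, accuracy):
    """Format value with digits down to the leading digit of the accuracy."""
    # adjusted() is the exponent of the most significant digit,
    # e.g. -3 for an accuracy of 0.005.
    quantum = Decimal(1).scaleb(accuracy.adjusted())
    return str(value.quantize(quantum))

print(display(Decimal("10.2034"), Decimal("0.005")))  # prints "10.203"
print(display(Decimal("10.2"), Decimal("0.005")))     # prints "10.200"
```

Only the value and the accuracy are stored here; the precision is recomputed at display time, and an explicit precision (as in an infobox) would simply bypass this function.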
-- daniel