Gregor Hagedorn g.m.hagedorn@gmail.com wrote:
So, please suggest terms to use for at least these two things:
- value certainty (ideally not using "digits", but something that is independent of unit and rendering)
Here we want to express that the true value lies, with a certain probability, within a given interval, e.g. "2.3 +/- 0.2 µm".
I am not too sure here myself. Different terms exist depending on whether you talk about the inherent measurement error of a single individual with a single true value, or about statistical measures or estimates.
Marco gives yet another example: "We want to specify the 'limits of (possible) variation' of a value, which would be Engineering tolerance. E.g. the values of electrical resistances, capacitors, etc. are given in Ω ± % or F ± %. We could also either use/allow/display absolute or relative values." -- In this case it is actually not an uncertainty of the actual sample of resistors, but a design specification, i.e. the specification that resistors must be (all, or only 95%?) within _at least_ these limits.
So what to do here?
List the different use cases of a value plus-minus other values?
- measurement-method limited precision range of single measurements (e.g. small structures in a light microscope, limited by the resolution capability of blue light, approx. 0.2 µm)
- measurement-method limited accuracy range (or accuracy plus precision)
- confidence interval for the mean (or other statistical parameters: mode, variance, etc.) of the population, as estimated based on a sample
- one of potentially several percentiles (incl. +/- s.d.) measuring spread, but giving no information about the probability that the true mean is between these values
- engineering design specifications that a given (unknown) fraction of individuals must be within these limits
I believe for the moment you don't want to go into certainty in the sense that a number is an estimate of a
All these different concepts rightly have different names. There can be:
- precision +/- 0.2
- accuracy +/- 0.2
- tolerance +/- 0.2
- error margin +/- 0.2
- +/- 1 or 2 s.d. +/- 0.2
- 95% confidence interval (CI) +/- 0.2
- 10th to 90th percentile +/- 0.2
- uncertainty (of what?) +/- 0.2
(ASIDE: +/- 2 s.d. defines roughly a 95% probability that the next value from a random sample is in the interval; the 95% CI, that the true value of the mean is in that interval. These are completely different things -- for the same measurements you can validly report 100 +/- 50 for the first and 100 +/- 0.001 for the second. That is, with probability 95% the next randomly sampled measurement will be between 50 and 150, and with probability 95% it is known that the true mean is between 99.999 and 100.001. Semantics matter, not only the "pattern" of plus-minus a value.)
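To make the aside concrete, here is a small Python sketch (the sample numbers are made up purely for illustration) computing both half-widths from the same data:

    # Same data, two very different plus-minus values: a spread interval
    # (+/- 2 s.d., ~95% of individual values for normal data) versus a
    # 95% confidence interval for the mean (which shrinks with sample size).
    import math
    import statistics

    measurements = [99.8, 100.1, 100.4, 99.6, 100.2, 99.9, 100.3, 99.7]
    n = len(measurements)
    mean = statistics.mean(measurements)
    sd = statistics.stdev(measurements)  # sample standard deviation

    # ~95% of individual values (normal assumption): mean +/- 2 s.d.
    spread_halfwidth = 2 * sd

    # ~95% CI for the true mean (normal approximation, ignoring the
    # small-sample t-correction): mean +/- 1.96 * s.d. / sqrt(n)
    ci_halfwidth = 1.96 * sd / math.sqrt(n)

    print(f"mean = {mean:.3f}")
    print(f"spread interval:    +/- {spread_halfwidth:.3f}")
    print(f"95% CI of the mean: +/- {ci_halfwidth:.3f}")

Note that with more measurements the CI half-width keeps shrinking while the spread interval does not -- which is exactly why the two must not share one label.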
Because of the widely varying use cases listed above, I believe we need very neutral labels for the plus-minus values if the data type shall simply provide two "variables" in a generic sense, the true semantics of which are then provided by qualifier information.
I could think of something like:
- lower range (lowerRange) and upper range (upperRange)
- lower/upper interval value/endpoint
but I don't much like this, because it would force people to abandon the plus/minus notation and calculate actual values.
Better may be something like:
- upwardsAbsolute
- downwardsAbsolute
- upwardsPercent
- downwardsPercent
or
- plusValueAbsolute
- minusValueAbsolute
- plusValuePercent
- minusValuePercent
as neutral terms - but I would be glad if someone comes up with other neutral terms.
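Just to illustrate how I imagine the neutral fields plus a semantics qualifier could fit together -- all names below (QuantityValue, IntervalSemantics, plusValueAbsolute, ...) are hypothetical proposals, not any existing schema -- a sketch in Python:

    # Hypothetical sketch: the value carries two neutral plus-minus fields;
    # the actual meaning (s.d., CI, tolerance, ...) comes from a qualifier.
    from dataclasses import dataclass
    from enum import Enum
    from typing import Optional

    class IntervalSemantics(Enum):
        PRECISION = "precision"
        ACCURACY = "accuracy"
        TOLERANCE = "tolerance"
        ERROR_MARGIN = "error margin"
        ONE_OR_TWO_SD = "+/- 1 or 2 s.d."
        CI_95 = "95% confidence interval"
        PERCENTILE_RANGE = "percentile range"

    @dataclass
    class QuantityValue:
        amount: float
        unit: str
        plusValueAbsolute: Optional[float] = None   # neutral upward offset
        minusValueAbsolute: Optional[float] = None  # neutral downward offset
        semantics: Optional[IntervalSemantics] = None  # qualifier, not pattern

    # "2.3 +/- 0.2 µm", here qualified as a 95% CI of the mean:
    v = QuantityValue(2.3, "µm", 0.2, 0.2, IntervalSemantics.CI_95)

The point of the design is that the plus/minus notation is preserved for input and display, while the qualifier keeps the semantics explicit and machine-readable.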
However, I hope we start realizing that all of us seem to look at this primarily from only one of the use cases listed above (myself included; I usually have cases with variance spread or CI of the mean). We should stop using terms that are specific to one but not the other of the cases. The assumption that "these things are all more or less the same" is not true. A confidence interval is neither a manufacturing tolerance nor a measurement precision. And precision is not accuracy, etc.
- output exactness (here, the number of digits is actually what we want to talk about)
xsd:totalDigits or Wikipedia: significantDigits or significantFigures
That is one way to express value exactness, albeit a coarse one.
Marco writes: "Everywhere in the realm of software development precision is used for this. Therefore also here the suggestion of precision was not that bad."
-> In software development, the term is about the precision of the numeric data type, i.e. the precision of the storage mechanism. The term precision is correctly applied there. However, we are talking about the actually significant digits of a measurement, which are part of the potential information on the precision and accuracy of the value. A value measured with e.g. 6 significant digits may be stored in a data type which has a precision of 16 digits. I think applying "precision" to significant digits produces a fundamental misunderstanding of what precision is; see the Wikipedia article on accuracy and precision.
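A small Python illustration of the difference (the variable names are mine):

    # A measurement significant to 6 digits, stored in an IEEE double
    # that carries roughly 16 decimal digits of *storage* precision:
    measured = 2.30000           # 6 significant digits as measured
    print(f"{measured:.20f}")    # 2.29999999999999982236 -- digits beyond
                                 # the 6th are storage artifacts, not data
    # The number of significant digits is metadata the float itself
    # cannot carry; it must be recorded separately:
    significant_digits = 6

The storage precision of the data type says nothing about how many digits of the measurement are actually significant.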
Hm, the second one is only relevant for output. Why not use the term outputformat as a pattern, just like Excel, OpenOffice, and LibreOffice do? This could include the number of digits after the decimal separator, the optional accuracy/whatever, and the unit. This would be fine for the API and the MW syntax.
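Just as an invented sketch of the idea (the renderer below is not any existing Excel or MediaWiki syntax, only an illustration; Excel-style "0.00" digit codes are the model, the plus-minus and unit parts are my assumptions):

    # Hypothetical output-format renderer: digits after the decimal
    # separator, optional plus-minus part, and the unit, in one place.
    def render(value: float, plus: float, minus: float, unit: str,
               decimals: int = 2) -> str:
        # A symmetric interval collapses to the familiar plus-minus notation.
        if plus == minus:
            return f"{value:.{decimals}f} +/- {plus:.{decimals}f} {unit}"
        return (f"{value:.{decimals}f} "
                f"+{plus:.{decimals}f}/-{minus:.{decimals}f} {unit}")

    print(render(2.3, 0.2, 0.2, "µm", decimals=1))  # -> 2.3 +/- 0.2 µm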
Cheers
Marco