A last note: listen to Markus, he is usually right.
Darn! 😤

On Fri, Aug 12, 2016 at 12:02 PM, John Erling Blad <jeblad@gmail.com> wrote:
The latest date for a population figure isn't necessarily the preferred one; it can be a prediction for a short timespan. For example, Statistics Norway provides a 3-month expectation in addition to the one-year stats. The one-year stats should be the preferred ones; the 3-month stats are a kind of expected change on last year's stats.

The main problem with the 3-month stats is that they usually can't be used together with the one-year stats, i.e. they can't be normalized against the same base. The absolute values may look comparable, but a growth rate computed against a one-year base would be wrong. That is quite a common error.
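To illustrate the base mismatch, here is a minimal sketch. All figures are made up for illustration (they are not Statistics Norway data), and the helper name growth_rate is hypothetical.

```python
def growth_rate(new_value, base_value):
    """Relative change of new_value against base_value."""
    return (new_value - base_value) / base_value

# Hypothetical figures, for illustration only.
pop_year_start = 100_000      # one-year stat, start of the year
pop_year_end = 102_000        # one-year stat, end of the year
pop_after_3_months = 102_500  # 3-month figure following the year end

# Same base and same period length: a valid one-year growth rate.
annual = growth_rate(pop_year_end, pop_year_start)

# Valid on its own, but it covers 3 months, not 12.
quarterly = growth_rate(pop_after_3_months, pop_year_end)

# The common error: a 3-month figure measured against the one-year base.
# The result spans 15 months but reads like a one-year rate.
mixed = growth_rate(pop_after_3_months, pop_year_start)

print(f"annual={annual:.4f} quarterly={quarterly:.4f} mixed={mixed:.4f}")
```

The mixed number looks plausible next to the annual one, which is exactly why the error is easy to make.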

A lot of stats "sound similar" but aren't. It is a bit awkward. Sometimes stats refer to international standards for how they should be produced, and in those cases they can be compared. This is often described on a metadata page for the stats. An example is population in rural areas, which many assume is defined the same way in all countries. It is not.

And while I'm at it: stats often describe a (possibly temporal) connection or relation between two or more (types of) subjects, and that is not something you should assign to just one of the subjects. If one part is a concrete instance, then it makes sense to add stats about the other types to that item, like population for a municipality, but otherwise it could be wrong.

In general, setting the last added or most recent value to preferred is wrong.

Also, that something is not preferred does not imply that it is deprecated. And note the difference between deprecated and deferred.

On Thu, Aug 11, 2016 at 10:56 PM, Stas Malyshev <smalyshev@wikimedia.org> wrote:
Hi!

> I would argue that this is better done by using qualifiers (e.g. start
> date, end date).  If a statement on the population size would be set to
> preferred, but isn't monitored for quite some time, it can be difficult
> to see if the "preferred" statement is still accurate, whereas a
> qualifier would give a better indication that that statement might need an
> update.

Right now this bot:
https://www.wikidata.org/wiki/User:PreferentialBot
watches statements like "population" that have multiple values with
different time qualifiers but no current preference.

What it doesn't currently do is to verify that the preferred one refers
to the latest date. It probably shouldn't fix these cases (because there
may be a valid reason why the latest is not the best, e.g. some population
estimates are more precise than others) but it can alert about it. This
can be added if needed.
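
A rough sketch of what such an alert could look like. This is not PreferentialBot's code and it does not call the Wikidata API; statements are plain dicts with hypothetical keys ("value", "point_in_time", "rank") and made-up values. The only thing taken from Wikidata is the three rank names: preferred, normal, deprecated.

```python
from datetime import date

def preferred_not_latest(statements):
    """Return preferred statements that are older than the latest dated one.

    Deprecated and undated statements are ignored. Nothing is changed;
    the result is only a list of candidates for a human to look at.
    """
    dated = [
        s for s in statements
        if s["rank"] != "deprecated" and s.get("point_in_time") is not None
    ]
    if not dated:
        return []
    latest = max(s["point_in_time"] for s in dated)
    return [
        s for s in dated
        if s["rank"] == "preferred" and s["point_in_time"] < latest
    ]

# Hypothetical population statements for one item.
population = [
    {"value": 1000, "point_in_time": date(2014, 1, 1), "rank": "preferred"},
    {"value": 1010, "point_in_time": date(2015, 1, 1), "rank": "normal"},
    {"value": 990, "point_in_time": date(2016, 1, 1), "rank": "deprecated"},
]

alerts = preferred_not_latest(population)
for s in alerts:
    print("check preferred value", s["value"], "dated", s["point_in_time"])
```

It reports and does not fix, since the older value may be preferred for a good reason.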

--
Stas Malyshev
smalyshev@wikimedia.org

_______________________________________________
Wikidata mailing list
Wikidata@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata