On 08/05/2016 08:57 AM, Daniel Kinzler wrote:
On 05.08.2016 at 17:34, Peter F. Patel-Schneider wrote:
So some additions are breaking changes then. What is a system that consumes this information supposed to do? If the system doesn't monitor announcements then it has to assume that any new field can be a breaking change and thus should not accept data that has any new fields.
The only way to avoid breakage is to monitor announcements. The format is not final, so changes can happen (not just additions, but also removals), and then things will break if consumers are unaware of them. We tend to be careful and conservative, and announce any breaking changes in advance, but do not guarantee full backwards compatibility forever.
The only alternative is a fully versioned interface, which we don't currently have for JSON, though it has been proposed, see https://phabricator.wikimedia.org/T92961.
I assume that you are referring to the common practice of adding extra fields in HTTP and email transport and header structures under the assumption that these extra fields will just be passed on to downstream systems and then silently ignored when content is displayed.
Indeed.
I view these as special cases where there is at least an implicit contract that no additional field will change the meaning of the existing fields and data.
In the name of the Robustness Principle, I would consider this the normal case, not the exception.
When such contracts are in place systems can indeed expect to see additional fields, and are permitted to ignore these extra fields.
Does this count? https://mail-archive.com/wikidata-tech@lists.wikimedia.org/msg00902.html
This email message is not a contract about how the Wikidata JSON data format can change. It instead describes how consumers of that (and other) data are supposed to act. My view is that without guarantees of what sort of changes will be made to the Wikidata JSON data format, these are dangerous behaviours for its consumers.
Because XML specifically states that the order of attributes is not significant. Therefore changes to the order of XML attributes do not change the encoding.
That's why I'm proposing to formalize the same kind of contract for us, see https://phabricator.wikimedia.org/T142084.
This contract guarantees that new fields will not change the interpretation of pre-existing ones, which is strong, but I don't see where it guarantees that the meaning of entire structures will not change, which is very weak.
Consider the rank field. This doesn't change the interpretation of existing fields. However, it changes how the entire claim is to be considered.
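A minimal sketch of this concern, under stated assumptions: the claim dictionaries are simplified and hypothetical (the property id and the `property`/`value` layout are made up for illustration and are not actual dump output), and only the `rank` field name and its `deprecated` value are taken from the Wikidata data model. A consumer that silently ignores the field it does not know keeps a claim that a rank-aware consumer would drop.

```python
# Simplified, hypothetical claims; real Wikidata statements carry more structure.
claims = [
    {"property": "P999", "value": 1000, "rank": "normal"},
    {"property": "P999", "value": 900, "rank": "deprecated"},
]


def values_ignoring_unknown_fields(claims):
    """A consumer written before `rank` existed: it reads only fields it knows."""
    return [claim["value"] for claim in claims]


def values_respecting_rank(claims):
    """A consumer that knows `rank` and drops deprecated claims."""
    return [
        claim["value"]
        for claim in claims
        if claim.get("rank", "normal") != "deprecated"
    ]


print(values_ignoring_unknown_fields(claims))
print(values_respecting_rank(claims))
```

Neither consumer misreads an existing field, yet the two disagree about which claims count.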
Here is where I disagree. As there is no contract that new fields in the Wikidata JSON dumps are not breaking, clients need to treat all new fields as potentially breaking and thus should not accept data with unknown fields.
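As a rough sketch of what "not accepting data with unknown fields" could look like in a consumer: the field set, the exception name, and the flat claim layout below are all hypothetical and chosen only for illustration.

```python
# Hypothetical set of fields this consumer was written against.
KNOWN_CLAIM_FIELDS = {"property", "value", "rank"}


class UnknownFieldError(ValueError):
    """Raised when a claim carries a field the consumer does not know."""


def read_claim_strict(claim):
    """Return (property, value), refusing claims with any unknown field."""
    unknown = set(claim) - KNOWN_CLAIM_FIELDS
    if unknown:
        raise UnknownFieldError(f"unknown fields: {sorted(unknown)}")
    return claim["property"], claim["value"]
```

Such a consumer fails loudly on any addition, breaking or not, which is the price of not having to guess.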
While you are correct that there is no formal contract yet, the topic had been explicitly discussed before, in particular with Markus.
I say this for any data, except where there is a contract that such additional fields are not meaning-changing.
Quote me on it:
For wikibase serializations, additional fields are not meaning changing. Changes to the format or interpretation of fields will be announced as a breaking change.
Clients need to be prepared to encounter entity types and data types they don't know. But they should also allow additional fields in any JSON object. We guarantee that extra fields do not impact the interpretation of fields they know about - unless we have announced and documented a breaking change.
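The consumer behaviour described in that statement might be sketched as follows. This is only an illustration under assumptions: the flat snak layout, the `KNOWN_DATATYPES` subset, and the function name are hypothetical and not part of any Wikibase API.

```python
# Hypothetical subset of data types this consumer understands.
KNOWN_DATATYPES = {"string", "quantity"}


def read_snaks_tolerant(snaks):
    """Read what is understood; skip unknown data types, ignore extra fields."""
    readable = []
    skipped = []
    for snak in snaks:
        datatype = snak.get("datatype")
        if datatype not in KNOWN_DATATYPES:
            # Unknown data type: record it and move on instead of failing.
            skipped.append(datatype)
            continue
        # Only known fields are read; any additional fields are ignored.
        readable.append((snak["property"], snak["value"]))
    return readable, skipped
```

Recording what was skipped at least lets the consumer report that it saw something it did not understand.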
Is this the contract that is going to be put forward? At some time in the not too distant future I hope that my company will be using Wikidata information in its products. This contract is likely to be problematic for development groups, who want some notion of how long they have to prepare for changes that can silently break their products.
This is indeed the gist of what I want to establish as a stability policy. Please comment on https://phabricator.wikimedia.org/T142084.
I'm not sure how this could be made less problematic. Even with a fully versioned JSON interface, available data types etc. are a matter of configuration. All we can do is announce such changes, and advise consumers that they can safely ignore unknown things.
You raise a valid point about due notice. What do you think would be a good notice period? Two weeks? A month?
Human-only due notice can only be a part of a well-behaved software ecosystem. Software ends up being used in places separated from its initial developers, indeed from any developer. Requiring software to silently accept breaking additions means that breaking additions will usually break something even with a long notice period.
There can also be no fixed notice period. Sometimes software can be changed and re-deployed in a day or a week. Often, however, change and re-deployment can take several months. Right now I would be leery of a two-week notice period, as it is entirely possible that this would fall within a vacation period for a group.
Peter F. Patel-Schneider
Nuance Communications