On 05.08.2016 at 15:02, Peter F. Patel-Schneider wrote:
I side firmly with Markus here.
Consumers of data generally cannot tell whether the addition of a new field to a data encoding is a breaking change or not.
Without additional information, they cannot know, though for "mix and match" formats like JSON and XML, it's common practice to assume that ignoring additions is harmless.
In any case, we had communicated before that we do not consider the addition of a field a breaking change. It only becomes a breaking change when it impacts the interpretation of other fields, in which case we would announce it well in advance.
Given this, code that consumes encoded data should at least produce warnings when it encounters encodings that it is not expecting and preferably should refuse to produce output in such circumstances.
That depends on the circumstances. For a web browser, for example, this would be very annoying behavior: nearly all websites would be unusable. Similarly, most email would become unreadable if mail clients were that strict.
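To make the two positions concrete, here is a minimal sketch of a check that can either warn about unexpected fields or refuse to continue. The function name `unexpected_fields`, the `strict` flag, and the field names in `EXPECTED_FIELDS` are all assumptions made up for illustration, not part of any Wikidata client.

```python
import warnings

# Hypothetical set of fields this client was written against.
EXPECTED_FIELDS = {"id", "type", "labels"}


def unexpected_fields(obj, expected=EXPECTED_FIELDS, strict=False):
    """Return the sorted list of keys in obj that the client does not expect.

    In lenient mode (the default) a warning is emitted and processing can
    continue. In strict mode a ValueError is raised instead.
    """
    extra = sorted(set(obj) - set(expected))
    if extra:
        message = "unexpected fields: " + ", ".join(extra)
        if strict:
            raise ValueError(message)
        warnings.warn(message)
    return extra
```

Which mode is appropriate depends on the consumer: a browser-like client would probably stay lenient, while a batch job that must not lose information might opt into `strict=True`.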
Producers of data thus should signal in advance any changes to the encoding, even if they know that the changes can be safely ignored.
I disagree on "any". For example, do you want announcements about changes to the order of attributes in XML tags? Why? In case someone uses a regex to process the XML? Should you not be able to rely on your clients conforming to the XML spec, which says that the order of attributes is not significant?
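As a small illustration of the regex problem, the sketch below compares a real XML parser with a hand-written pattern. The `<item>` element, its attributes, and the names `attrs` and `BRITTLE` are invented for this example.

```python
import re
import xml.etree.ElementTree as ET

# Two serializations of the same element; only the attribute order differs.
doc_a = '<item id="Q42" lang="en"/>'
doc_b = '<item lang="en" id="Q42"/>'


def attrs(xml_text):
    """Parse the element and return its attributes as a plain dict."""
    return dict(ET.fromstring(xml_text).attrib)


# A brittle pattern that silently assumes "id" always comes before "lang".
BRITTLE = re.compile(r'<item id="([^"]+)" lang="([^"]+)"/>')

same_for_parser = attrs(doc_a) == attrs(doc_b)    # parser sees no difference
regex_matches_a = BRITTLE.match(doc_a) is not None
regex_matches_b = BRITTLE.match(doc_b) is not None  # the regex breaks here
```

A client built on the parser is unaffected by the reordering, while the regex-based one fails on `doc_b` even though both documents carry the same information.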
In the case at hand (adding a field), it would have been good to communicate it in advance. But since it wasn't tagged as "breaking", it slipped through. We are sorry for that. Clients should still not choke on an addition like this.
I would view software that consumes Wikidata information and silently ignores fields that it is not expecting as deficient and would counsel against using such software.
Is this just for Wikidata, or does that extend to other kinds of data too? Why, or why not?
By definition, any extensible format or protocol (HTTP, SMTP, HTML, XML, XMPP, IRC, etc.) can contain parts (headers, elements, attributes) that the client does not know about and should ignore. Of course, the spec will tell clients where to expect and allow extra bits. That's why I'm planning to put up a document saying clearly what kinds of changes clients should be prepared to see in Wikidata output:
Clients need to be prepared to encounter entity types and data types they don't know. But they should also allow additional fields in any JSON object. We guarantee that extra fields do not impact the interpretation of the fields clients already know about, unless we have announced and documented a breaking change.
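A client following these rules could look roughly like the sketch below. It is only an illustration under assumptions: the JSON shape is a simplified stand-in rather than the actual Wikidata output format, and `KNOWN_DATATYPES`, `read_value`, and the field name `extra-field` are hypothetical.

```python
import json

# Hypothetical subset of data types this client understands.
KNOWN_DATATYPES = {"string", "time"}


def read_value(raw_json):
    """Return (datatype, value) for a known data type, or None otherwise.

    Only the fields the client knows about are read; any additional
    fields in the JSON object are ignored.
    """
    obj = json.loads(raw_json)
    datatype = obj.get("datatype")
    if datatype not in KNOWN_DATATYPES:
        return None  # unknown data type: skip it instead of failing
    return datatype, obj.get("value")


old = '{"datatype": "string", "value": "Douglas Adams"}'
new = '{"datatype": "string", "value": "Douglas Adams", "extra-field": 1}'
unknown = '{"datatype": "some-future-type", "value": {}}'
```

Here `read_value(old)` and `read_value(new)` give the same result, because the added field does not change the meaning of the fields the client reads, and `read_value(unknown)` returns `None` rather than raising.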