On 11.03.2016 at 11:20, Markus Kroetzsch wrote:
Maybe the community needs a bit more explanation as to why you "consciously" decide to override their judgement.
The idea is to give the community a tool to explicitly model their judgement that something is an identifier, and introduce that idea of external identifiers into the software exactly because that need was expressed by the community. Relevant use cases: linking, mapping, and UI structure.
The use of property P1921 clearly tells you what the community wants. If we want to have URIs only for some subset of properties, then we will use P1921 only on a subset. It is very easy and gives us complete control. The use of ExternalId as an additional restricting mechanism is neither helpful nor desired.
Can you give an example of something you want to map to a URI, but that is not an external identifier? There are probably edge cases, and thinking about them and deciding on the desired semantics is a good thing, I believe.
We can decide for ourselves which properties should have URIs exported for them, without needing conscious but unprincipled development decisions to constrain us.
"unprincipled", wow. The decision followed the principle that we want to have software that is extensible and maintainable, and we want a data model that makes explicit the semantics of values. Following these principles, the declaration of what a value is dictates what you can do with it. That's the basic idea of object oriented design.
Of course, it would be possible to ditch these principles, and use the "duck typing" approach: anything that has a formatter URL could be linked, etc. But that introduces several problems:
* modeling: values can suddenly stop "being" identifiers, or become other things, based on the statements on the property definition. This can lead to inconsistencies in the way values are represented in dumps etc.
* implementation: we would either need to hard code a special case, or a mechanism to apply all kinds of behaviors (formatting, mapping, parsing, etc) based on all kinds of statements on properties. We can hard code for a few things, but a general mechanism would hardly be scalable or maintainable. We do have a solid and simple mechanism based on data types that works fine to cover the use cases for external identifiers.
* stability: if we base more and more behavior of the software on properties and statements defined by the community, the community would no longer be free to modify such properties and statements. That would break the software. We do compromise about this sometimes: Wikibase can be configured to know about a few properties and items (such as P1630). But we should be careful about it, because it takes away control from the community.
* consistency: You can't link just any kind of value based on a formatter URI. That only works for string values, and probably shouldn't be done for string values that have the "url" data type. So linking would only work for properties declared to be plain strings per their data type. Again, behavior is bound to the data type.
These principles are actually why we have data types at all. You were there when we decided to have them. If we didn't care about the points above, we wouldn't need data types at all; value types would be sufficient. Everything else would be covered by "if it quacks like a duck...". That would mean a less expressive data model, and more complicated software. A lot more complicated, if you want to apply this to everything.
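To make the contrast concrete, here is a minimal Python sketch. It is not Wikibase code: `PropertyDef`, `link_by_data_type`, `link_by_duck_typing`, and the example.org pattern are hypothetical names made up for illustration. The only things borrowed from Wikidata are the property id P1630 (formatter URL) and the convention that `$1` in the pattern stands for the value.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class PropertyDef:
    """Hypothetical, simplified stand-in for a property definition."""
    pid: str
    data_type: str  # declared once, e.g. "external-id", "string", "url"
    # Statements on the property page; the community can edit these freely.
    statements: Dict[str, str] = field(default_factory=dict)


def expand(pattern: str, value: str) -> str:
    """Fill the $1 placeholder of a formatter pattern with the value."""
    return pattern.replace("$1", value)


def link_by_data_type(prop: PropertyDef, value: str) -> Optional[str]:
    """Data type approach: the declared type decides whether linking applies."""
    if prop.data_type != "external-id":
        return None
    pattern = prop.statements.get("P1630")
    return expand(pattern, value) if pattern else None


def link_by_duck_typing(prop: PropertyDef, value: str) -> Optional[str]:
    """Duck typing approach: anything that has a formatter URL gets linked."""
    pattern = prop.statements.get("P1630")
    return expand(pattern, value) if pattern else None


ext = PropertyDef("P-ext", "external-id", {"P1630": "https://example.org/id/$1"})
txt = PropertyDef("P-str", "string", {"P1630": "https://example.org/id/$1"})
```

With these two made-up properties, `link_by_data_type` links values of `ext` but not of `txt`, while `link_by_duck_typing` links both. In the duck-typed variant, whether a value is treated as an identifier at all depends only on an editable statement, which is the modeling and stability concern described above.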
It would be helpful if you could share some pointers (1) to the original announcement and documentation for this restricting behaviour for URI exports (clearly, this information is vital for the ongoing discussion on property conversion),
It's a modeling tool, not a restriction. If there are things that should be mapped to URIs but for some reason shouldn't have the ExternalId type, we should look at these edge cases closely to find out what is wrong. Clearly, if it's not an identifier of some sort, it can't sensibly have a URI; and if it is an identifier of some sort, there should be no reason not to mark it as such to the software, by making it an ExternalId.
and (2) to the discussions that have led to this design (surely you must have consulted with some RDF/SPARQL users and developers to conclude that some P1921 should be ignored).
I do not think any should be ignored. I think that properties that use P1921 should be ExternalIds. Please explain why you would not want that.
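One way to find the edge cases worth discussing would be to list the properties that carry a P1921 statement but are not typed as external identifiers. The following is only a sketch under assumptions: the function name, the input shape (property id mapped to its data type and the ids of the statements on it), and the sample data are all hypothetical and do not reflect real Wikidata properties.

```python
from typing import Dict, List, Tuple


def p1921_without_external_id(
    properties: Dict[str, Tuple[str, List[str]]]
) -> List[str]:
    """Return ids of properties that have a P1921 statement
    but whose declared data type is not "external-id"."""
    return sorted(
        pid
        for pid, (data_type, statement_pids) in properties.items()
        if "P1921" in statement_pids and data_type != "external-id"
    )


# Made-up sample data, for illustration only.
sample = {
    "P-a": ("external-id", ["P1630", "P1921"]),
    "P-b": ("string", ["P1921"]),
    "P-c": ("string", ["P1630"]),
}
candidates = p1921_without_external_id(sample)
```

Each property in `candidates` is then either an identifier that should be converted, or a genuine edge case that needs a closer look.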
I am really curious to learn what "we" refers to in "we made a conscious decision".
Decisions about the design and implementation of the software are made by the development team ("us"), based on requirements and considerations on the technical as well as the product level, which in turn are informed by community interaction, among other things.
As is often the case, solutions that have to be maintainable and scalable are not quite as nice as one-off solutions for a special case. MediaWiki is conservative about adding special case features for good reasons: it's quite complex as it is, and if it had tried to cater to every special case, it would have collapsed under its own weight a long time ago.
The idea is to generalize from special cases, and implement something that will work for many more cases, even though it perhaps covers only 90% of what you could do by catering to the special case directly.
Of course, overly generic multi-option multi-purpose mechanisms should also be avoided, because they are hard to understand and hard to maintain. So a balance needs to be found.
Trying to strike that balance, in 2012 we (in this case including you, iirc) designed data types to be a simple yet sufficiently generic mechanism for associating behavior with values. So now we use it to associate behavior with values (like mapping to URLs and URIs), and I am very reluctant to introduce another mechanism for associating behavior with values.