Hey Markus,
Thanks for the writeup. This clarified some things, at least for me.
However, this does not mean that you have to store the value as a compound
object that contains many strings. In fact, this strikes me as a rather
cumbersome approach that would make it harder to use the data. In SMW we
store URIs as one string. Splitting this string into parts (under the
assumption that it was a well-formed URL to start with) is quite easy, if
this is needed (SMW does this). Conclusion: the use of a datatype for IRIs
is in no way tied to the use of an impractical serialisation; reference
implementations exist.
Agreed. The IriValue implementation is based on the SMW one, and retains
this capability. serialize concatenates the parts into one string, and
unserialize splits that string back into its parts before they are passed
to the constructor.
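
To make that round trip concrete, here is a rough sketch in Python. It is
not the actual IriValue code: the field names (scheme, hierarchical_part,
query, fragment) and the splitting rules are my assumptions for
illustration, and it presumes a well-formed IRI as input.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IriValue:
    # Hypothetical field names; the real implementation may differ.
    scheme: str
    hierarchical_part: str
    query: str = ""
    fragment: str = ""

    def serialize(self) -> str:
        """Concatenate the parts into a single string."""
        text = f"{self.scheme}:{self.hierarchical_part}"
        if self.query:
            text += "?" + self.query
        if self.fragment:
            text += "#" + self.fragment
        return text

    @classmethod
    def unserialize(cls, text: str) -> "IriValue":
        """Split a single string back into parts and pass them to the constructor."""
        scheme, colon, rest = text.partition(":")
        if not colon or not scheme:
            raise ValueError(f"not an IRI with a scheme: {text!r}")
        # The fragment starts at the first '#', the query at the first '?' before it.
        rest, _, fragment = rest.partition("#")
        hierarchical_part, _, query = rest.partition("?")
        return cls(scheme, hierarchical_part, query, fragment)
```

The stored form is one plain string, while code that needs the individual
parts still gets them from the object.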
We are currently not using this, though. Instead we are using the last two
methods here:
https://github.com/wikimedia/mediawiki-extensions-DataValues/blob/eea0d0e19…
So the solution to the problem at hand seems to be either to change these
two methods to do the same as serialize and unserialize, or to simply not
use these methods in our serialization process. The former approach is the
most local and the easiest to implement and, given the urgency, the one I
suggest going with.
(Somewhat different topic, mainly directed at the WD team itself:) This
issue is, however, an indication that having these two methods in the
DataValue implementations is not the best idea to begin with. This has been
clear for some time, though in order to fix it, we effectively need to go
with the second approach and implement proper serialization infrastructure
for DataValues. That would also fix a number of other problems and
awkwardness the current approach is causing.
Cheers
--
Jeroen De Dauw
http://www.bn2vs.com
Don't panic. Don't be evil. ~=[,,_,,]:3