On 05.02.2016 12:19, Daniel Kinzler wrote:
As Lydia announced, we are going to deploy support for two new data types soon (think of "data types" as "property types", as opposed to "value types"):
...
The datatypes themselves are declared as follows:
wd:P708 a wikibase:Property ; wikibase:propertyType wikibase:ExternalId .
wd:P717 a wikibase:Property ; wikibase:propertyType wikibase:Math .
Accordingly, the URIs of the datatypes (not the types of the literals!) are:
http://wikiba.se/ontology-beta#ExternalId
http://wikiba.se/ontology-beta#Math
Thanks, this is all I need to know. We will have a new release in time.
...
Here are some changes concerning the math and external-id data types that we are considering or planning for the future.
- For the Math datatype, we may want to provide a type URI for the RDF string
literal that indicates that the format is indeed TeX. Perhaps we could use http://purl.org/xtypes/Fragment-LaTeX.
+1 to this, especially if the string can actually be guaranteed to be LaTeX (not just regarding special commands, but also in general -- not sure if the current datatype does any type checking for the string).
- For the ExternalId data type, we would like to use resource URIs for external
IDs (in "direct claims"), if possible. This would only work if we know the base URI for the property (provided by a statement on the property definition). For properties with no base URI set, we would still use plain string literals.
Note that what you call "base URI" is called "URI pattern for RDF resource" on Wikidata (https://www.wikidata.org/wiki/Property:P1921). We are already using this in RDF exports. It is not specific to identifier properties; it can be used with any string property where IRIs make sense.
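To illustrate the pattern-based approach, here is a minimal sketch of how an IRI could be derived from such a pattern. It assumes the pattern marks the identifier position with "$1" and that the identifier should be percent-encoded before substitution; the function name expand_uri_pattern is made up for this example and is not part of any existing API.

```python
from typing import Optional
from urllib.parse import quote


def expand_uri_pattern(pattern: Optional[str], external_id: str) -> Optional[str]:
    """Return the IRI for external_id, or None if the property has no pattern."""
    if not pattern:
        # No pattern set on the property: the caller keeps the plain string.
        return None
    # Assumption: "$1" is the placeholder, and the identifier is
    # percent-encoded so that it cannot break the IRI syntax.
    return pattern.replace("$1", quote(external_id, safe=""))


iri = expand_uri_pattern("https://tardis.net/allonzy/$1", "BADWOLF")
```

Whether percent-encoding is appropriate probably depends on the individual property, so that part in particular should be read as a guess.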
In our example above, the base URI for P708 might be https://tardis.net/allonzy/. The Turtle snippet would read:
wd:Q2209 a wikibase:Item ;
  wdt:P717 "\\sin x^2 + \\cos_b x ^ 2 = e^{2 \\tfrac\\pi{i}}"^^purl:Fragment-LaTeX ;
  wdt:P708 <https://tardis.net/allonzy/BADWOLF> .
Going from string literals to IRIs changes the property type in incompatible ways. To keep existing queries (with filters etc.) working, it is better to add the URI as an extra triple rather than having it replace the main (string) id value. This also matters for users who want to display the data returned by a query the way it appears on Wikidata (you don't want to extract the string value from the IRI with string operations). This is also how it is currently implemented in the RDF exports.
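As a rough sketch of the "extra triple" idea: the string literal stays on the direct-claim predicate, and the IRI is added under a second predicate when one is available. The prefix "ex-uri:" and the helper names below are placeholders invented for this example; they are not the predicates used by any actual export.

```python
from typing import List, Optional, Tuple

Triple = Tuple[str, str, str]


def turtle_string(value: str) -> str:
    """Quote a single-line string as a Turtle literal (escapes \\ and " only)."""
    return '"' + value.replace("\\", "\\\\").replace('"', '\\"') + '"'


def direct_claim_triples(
    item: str,
    prop: str,
    value: str,
    iri: Optional[str] = None,
    iri_prefix: str = "ex-uri:",  # hypothetical prefix for the extra triple
) -> List[Triple]:
    """Keep the string value as the main triple; add the IRI as a second one."""
    triples = [(f"wd:{item}", f"wdt:{prop}", turtle_string(value))]
    if iri is not None:
        triples.append((f"wd:{item}", f"{iri_prefix}{prop}", f"<{iri}>"))
    return triples


triples = direct_claim_triples(
    "Q2209", "P708", "BADWOLF", "https://tardis.net/allonzy/BADWOLF"
)
```

Existing queries that filter on the string value of wdt:P708 keep working, and consumers that want a resource can use the second triple.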
However, the full representation of the statement would still use the original string literal:
wds:Q2209-24942a17-4791-a49d-6469-54e581eade55 a wikibase:Statement, wikibase:BestRank ;
  wikibase:rank wikibase:NormalRank ;
  ps:P708 "BADWOLF" .
We would also like to provide the full URI of the external resource in JSON, making us a good citizen of the web of linked data. We plan to do this using a mechanism we call "derived values", which we also plan to use for other kinds of normalization in the JSON output. The idea is to include additional data values in the JSON representation of a Snak:
{
  "snaktype": "value",
  "property": "P708",
  "datavalue": { "value": "BADWOLF", "type": "string" },
  "datavalue-uri": { "value": "https://tardis.net/allonzy/BADWOLF", "type": "string" },
  "datatype": "external-id"
}
In some cases, such as ISBNs, we would want a URL as well as a URI:
{
  "snaktype": "value",
  "property": "P708",
  "datavalue": { "value": "3827370191", "type": "string" },
  "datavalue-uri": { "value": "urn:isbn:3827370191", "type": "string" },
  "datavalue-url": { "value": "https://www.wikidata.org/wiki/Special:BookSources/3827370191", "type": "string" },
  "datatype": "external-id"
}
The base URL would be given as a statement on the property, just like the base URI.
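A minimal sketch of how such a Snak could be assembled, assuming both patterns use "$1" as the placeholder for the identifier. The function external_id_snak and its parameters are invented for illustration; only the JSON keys are taken from the examples above.

```python
from typing import Any, Dict, Optional


def external_id_snak(
    prop: str,
    value: str,
    uri_pattern: Optional[str] = None,  # assumed to contain "$1"
    url_pattern: Optional[str] = None,  # assumed to contain "$1"
) -> Dict[str, Any]:
    """Build a Snak dict, adding derived values only for the patterns given."""
    snak: Dict[str, Any] = {
        "snaktype": "value",
        "property": prop,
        "datavalue": {"value": value, "type": "string"},
    }
    derived = (("datavalue-uri", uri_pattern), ("datavalue-url", url_pattern))
    for key, pattern in derived:
        if pattern:
            snak[key] = {"value": pattern.replace("$1", value), "type": "string"}
    snak["datatype"] = "external-id"
    return snak


isbn_snak = external_id_snak(
    "P708",
    "3827370191",
    uri_pattern="urn:isbn:$1",
    url_pattern="https://www.wikidata.org/wiki/Special:BookSources/$1",
)
```

The original "datavalue" is never modified, so clients that ignore the derived keys see exactly what they see today.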
We plan to use the same mechanism for giving Quantities in a standard unit, providing thumbnail URLs for CommonsMedia values, etc.
I think I already commented on this in other places. Wasn't there a tracker item where the derived values were discussed? Something to keep in mind here is that many properties have multiple URIs and URLs associated with them. This is no problem in RDF, but the encoding above might not work for this case.
Markus