As Lydia announced, we are going to deploy support for two new data types soon (think of "data types" as "property types", as opposed to "value types"):
* The "math" type for formulas. This will use TeX syntax and is provided by the same extension that implements <math> for wikitext. We plan to roll this out on Feb 9th.
* The "external-id" type for references to external resources. We plan to roll this out on Feb 16th. NOTE: Many of the existing properties for external identifiers will be converted from the plain "string" data type to the new "external-id" data type, see https://www.wikidata.org/wiki/User:Addshore/Identifiers.
Both these new types will use the "string" value type. Below are two examples of Snaks that use the new data type, in JSON:
{ "snaktype": "value", "property": "P717", "datavalue": { "value": "\\sin x^2 + \\cos_b x ^ 2 = e^{2 \\tfrac\\pi{i}}", "type": "string" }, "datatype": "math" }
{ "snaktype": "value", "property": "P708", "datavalue": { "value": "BADWOLF", "type": "string" }, "datatype": "external-id" }
As you can see, the only thing that is new is the value of the "datatype" field.
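For consumers of the JSON, this means existing string handling should keep working as long as the code does not reject unknown "datatype" values. A minimal Python sketch of such a reader follows; the function name and the set of accepted data types are illustrative assumptions, not part of Wikibase. The backslashes in the TeX value are doubled so that a JSON parser accepts it.

```python
import json

# The two example snaks from this message, as a JSON array.
# Backslashes are doubled because "\s" and "\c" are not valid JSON escapes.
SNAKS_JSON = r'''
[
  { "snaktype": "value", "property": "P717",
    "datavalue": { "value": "\\sin x^2 + \\cos_b x ^ 2 = e^{2 \\tfrac\\pi{i}}", "type": "string" },
    "datatype": "math" },
  { "snaktype": "value", "property": "P708",
    "datavalue": { "value": "BADWOLF", "type": "string" },
    "datatype": "external-id" }
]
'''

# Data types that are carried by the "string" value type (assumed list,
# based only on what this announcement says).
STRING_BACKED_DATATYPES = {"string", "math", "external-id"}


def read_string_snak(snak):
    """Return (datatype, value) for a string-backed value snak, else None."""
    if snak.get("snaktype") != "value":
        return None
    if snak.get("datatype") not in STRING_BACKED_DATATYPES:
        return None
    datavalue = snak["datavalue"]
    if datavalue.get("type") != "string":
        return None
    return snak["datatype"], datavalue["value"]


snaks = json.loads(SNAKS_JSON)
for snak in snaks:
    print(read_string_snak(snak))
```

The reader keys on the value type ("string") for parsing and only uses "datatype" to decide how to interpret the result.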
Similarly, in RDF, both new data types use plain string literals for now, as you can see from the turtle snippet below:
wd:Q2209 a wikibase:Item ; wdt:P717 "\\sin x^2 + \\cos_b x ^ 2 = e^{2 \\tfrac\\pi{i}}" ; wdt:P708 "BADWOLF" .
The datatypes themselves are declared as follows:
wd:P708 a wikibase:Property ; wikibase:propertyType wikibase:ExternalId .
wd:P717 a wikibase:Property ; wikibase:propertyType wikibase:Math .
Accordingly, the URIs of the datatypes (not the types of the literals!) are:
http://wikiba.se/ontology-beta#ExternalId
http://wikiba.se/ontology-beta#Math
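As a rough illustration of how the JSON "datatype" strings line up with these URIs, here is a small Python sketch. The lookup table and function names are made up for this example; only the two URIs and the declaration format are taken from this message.

```python
# Namespace of the (beta) Wikibase ontology, as given above.
ONTOLOGY_PREFIX = "http://wikiba.se/ontology-beta#"

# JSON "datatype" string -> local name of the property type in the ontology.
# Only the two new types from this announcement are listed.
PROPERTY_TYPE_LOCAL_NAMES = {
    "external-id": "ExternalId",
    "math": "Math",
}


def property_type_uri(datatype):
    """Full URI of the property type for a JSON datatype string."""
    return ONTOLOGY_PREFIX + PROPERTY_TYPE_LOCAL_NAMES[datatype]


def property_declaration(property_id, datatype):
    """Turtle declaration of a property, in the form shown above."""
    local_name = PROPERTY_TYPE_LOCAL_NAMES[datatype]
    return (
        f"wd:{property_id} a wikibase:Property ; "
        f"wikibase:propertyType wikibase:{local_name} ."
    )


print(property_type_uri("math"))
print(property_declaration("P708", "external-id"))
```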
These are, for now, the only changes to the representation of Snaks. We are, however, considering some additional changes for the future. To avoid confusion, I'll put them below a big separator:
ANNOUNCEMENT ABOVE! -------------------------------------------------------------------------------- ROUGH PLANS BELOW!
Here are some changes concerning the math and external-id data types that we are considering or planning for the future.
* For the Math datatype, we may want to provide a type URI for the RDF string literal that indicates that the format is indeed TeX. Perhaps we could use http://purl.org/xtypes/Fragment-LaTeX.
* For the ExternalId data type, we would like to use resource URIs for external IDs (in "direct claims"), if possible. This would only work if we know the base URI for the property (provided by a statement on the property definition). For properties with no base URI set, we would still use plain string literals.
In our example above, the base URI for P708 might be https://tardis.net/allonzy/. The Turtle snippet would read:
wd:Q2209 a wikibase:Item ; wdt:P717 "\\sin x^2 + \\cos_b x ^ 2 = e^{2 \\tfrac\\pi{i}}"^^purl:Fragment-LaTeX ; wdt:P708 <https://tardis.net/allonzy/BADWOLF> .
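The fallback rule described above (resource URI when a base URI is known, plain string literal otherwise) could be sketched in Python roughly as follows. This is not the Wikibase implementation: the function names are hypothetical, and percent-encoding the identifier before appending it is an assumption of this sketch.

```python
from urllib.parse import quote


def turtle_string(value):
    """Quote a value as a Turtle string literal.

    Only backslashes and double quotes are escaped here; control
    characters such as newlines are not handled in this sketch.
    """
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return '"' + escaped + '"'


def direct_claim_object(value, base_uri=None):
    """Object term for a direct claim on an external-id property.

    With a base URI the identifier becomes a resource IRI; without one
    it stays a plain string literal.
    """
    if base_uri is None:
        return turtle_string(value)
    # Assumption: the identifier is percent-encoded before being appended.
    return "<" + base_uri + quote(value, safe="") + ">"


print(direct_claim_object("BADWOLF", "https://tardis.net/allonzy/"))
print(direct_claim_object("BADWOLF"))
```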
However, the full representation of the statement would still use the original string literal:
wds:Q2209-24942a17-4791-a49d-6469-54e581eade55 a wikibase:Statement, wikibase:BestRank ; wikibase:rank wikibase:NormalRank ; ps:P708 "BADWOLF" .
We would also like to provide the full URI of the external resource in JSON, making us a good citizen of the web of linked data. We plan to do this using a mechanism we call "derived values", which we also plan to use for other kinds of normalization in the JSON output. The idea is to include additional data values in the JSON representation of a Snak:
{ "snaktype": "value", "property": "P708", "datavalue": { "value": "BADWOLF", "type": "string" }, "datavalue-uri": { "value": "https://tardis.net/allonzy/BADWOLF", "type": "string" }, "datatype": "external-id" }
In some cases, such as ISBNs, we would want a URL as well as a URI: { "snaktype": "value", "property": "P708", "datavalue": { "value": "3827370191", "type": "string" }, "datavalue-uri": { "value": "urn:isbn:3827370191", "type": "string" }, "datavalue-url": { "value": "https://www.wikidata.org/wiki/Special:BookSources/3827370191", "type": "string" }, "datatype": "external-id" }
The base URL would be given as a statement on the property, just like the base URI.
We plan to use the same mechanism for giving Quantities in a standard unit, providing thumbnail URLs for CommonsMedia values, etc.
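To make the "derived values" idea concrete, here is a minimal Python sketch that adds the extra fields to a Snak. The field names "datavalue-uri" and "datavalue-url" are the ones from the examples above; the function itself, its parameters, and the simple prefix concatenation are assumptions for illustration only.

```python
import copy


def with_derived_values(snak, base_uri=None, base_url=None):
    """Return a copy of the snak with derived URI/URL values added.

    The original "datavalue" is left untouched; derived values are only
    added for the bases that are actually known for the property.
    """
    result = copy.deepcopy(snak)
    identifier = snak["datavalue"]["value"]
    if base_uri is not None:
        result["datavalue-uri"] = {"value": base_uri + identifier, "type": "string"}
    if base_url is not None:
        result["datavalue-url"] = {"value": base_url + identifier, "type": "string"}
    return result


isbn_snak = {
    "snaktype": "value",
    "property": "P708",
    "datavalue": {"value": "3827370191", "type": "string"},
    "datatype": "external-id",
}

derived = with_derived_values(
    isbn_snak,
    base_uri="urn:isbn:",
    base_url="https://www.wikidata.org/wiki/Special:BookSources/",
)
print(derived["datavalue-uri"]["value"])
print(derived["datavalue-url"]["value"])
```

Because the original "datavalue" is never changed, clients that ignore the extra keys see exactly what they saw before.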
On 05.02.2016 12:19, Daniel Kinzler wrote:
As Lydia announced, we are going to deploy support for two new data types soon (think of "data types" as "property types", as opposed to "value types"):
...
The datatypes themselves are declared as follows:
wd:P708 a wikibase:Property ; wikibase:propertyType wikibase:ExternalId .
wd:P717 a wikibase:Property ; wikibase:propertyType wikibase:Math .
Accordingly, the URIs of the datatypes (not the types of the literals!) are:
http://wikiba.se/ontology-beta#ExternalId
http://wikiba.se/ontology-beta#Math
Thanks, this is all I need to know. We will have a new release in time.
...
Here are some changes concerning the math and external-id data types that we are considering or planning for the future.
- For the Math datatype, we may want to provide a type URI for the RDF string literal that indicates that the format is indeed TeX. Perhaps we could use http://purl.org/xtypes/Fragment-LaTeX.
+1 to this, especially if the string can actually be guaranteed to be LaTeX (not just regarding special commands, but also in general -- not sure if the current datatype does any type checking for the string).
- For the ExternalId data type, we would like to use resource URIs for external IDs (in "direct claims"), if possible. This would only work if we know the base URI for the property (provided by a statement on the property definition). For properties with no base URI set, we would still use plain string literals.
Note that your "base URI" on Wikidata is called "URI pattern for RDF resource" (https://www.wikidata.org/wiki/Property:P1921). We are already using this in RDF exports. This is not specific to identifier properties but can be used with any string property where IRIs make sense.
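As far as I know, the values of P1921 are patterns in which "$1" marks where the identifier goes, rather than bare prefixes. A small Python sketch of expanding such a pattern is below; the function name is hypothetical and the pattern is the fictional example from this thread written in that form, so treat it as an assumption rather than a real property value.

```python
def expand_uri_pattern(pattern, value):
    """Replace the "$1" placeholder in a URI pattern with the value."""
    if "$1" not in pattern:
        raise ValueError("pattern has no $1 placeholder: " + pattern)
    return pattern.replace("$1", value)


# Fictional pattern, modelled on the example base URI in this thread.
print(expand_uri_pattern("https://tardis.net/allonzy/$1", "BADWOLF"))
```

A pattern is slightly more general than a base URI, since the identifier does not have to come last.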
In our example above, the base URI for P708 might be https://tardis.net/allonzy/. The Turtle snippet would read:
wd:Q2209 a wikibase:Item ; wdt:P717 "\\sin x^2 + \\cos_b x ^ 2 = e^{2 \\tfrac\\pi{i}}"^^purl:Fragment-LaTeX ; wdt:P708 <https://tardis.net/allonzy/BADWOLF> .
Going from string literals to IRIs changes the property type in incompatible ways. To keep existing queries (with filters etc.) working, it is better to add the URI as an extra triple rather than have it replace the main (string) ID value. This also matters for users who want to display query results the way they appear on Wikidata (you don't want to extract the string value from the IRI with string operations). This is also how it is currently implemented in the RDF exports.
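A rough Python sketch of the "extra triple" approach: the string triple is always emitted unchanged, and the IRI, when known, goes into a second triple. The predicate "ex:resource" is a placeholder invented for this example and is not the predicate used by the actual RDF exports; the function name is likewise hypothetical.

```python
def direct_claim_triples(subject, property_id, value, uri=None,
                         uri_predicate="ex:resource"):
    """Turtle triples for one external-id value.

    The string literal is kept as the main value so that existing
    queries and filters continue to match; the IRI is only added.
    """
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    triples = [f'{subject} wdt:{property_id} "{escaped}" .']
    if uri is not None:
        # Placeholder predicate; the real export vocabulary may differ.
        triples.append(f"{subject} {uri_predicate} <{uri}> .")
    return triples


triples = direct_claim_triples(
    "wd:Q2209", "P708", "BADWOLF",
    uri="https://tardis.net/allonzy/BADWOLF",
)
for triple in triples:
    print(triple)
```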
However, the full representation of the statement would still use the original string literal:
wds:Q2209-24942a17-4791-a49d-6469-54e581eade55 a wikibase:Statement, wikibase:BestRank ; wikibase:rank wikibase:NormalRank ; ps:P708 "BADWOLF" .
We would also like to provide the full URI of the external resource in JSON, making us a good citizen of the web of linked data. We plan to do this using a mechanism we call "derived values", which we also plan to use for other kinds of normalization in the JSON output. The idea is to include additional data values in the JSON representation of a Snak:
{ "snaktype": "value", "property": "P708", "datavalue": { "value": "BADWOLF", "type": "string" }, "datavalue-uri": { "value": "https://tardis.net/allonzy/BADWOLF", "type": "string" }, "datatype": "external-id" }
In some cases, such as ISBNs, we would want a URL as well as a URI: { "snaktype": "value", "property": "P708", "datavalue": { "value": "3827370191", "type": "string" }, "datavalue-uri": { "value": "urn:isbn:3827370191", "type": "string" }, "datavalue-url": { "value": "https://www.wikidata.org/wiki/Special:BookSources/3827370191", "type": "string" }, "datatype": "external-id" }
The base URL would be given as a statement on the property, just like the base URI.
We plan to use the same mechanism for giving Quantities in a standard unit, providing thumbnail URLs for CommonsMedia values, etc.
I think I already commented on this in other places. Wasn't there a tracker item where the derived values were discussed? Something to keep in mind here is that many properties have multiple URIs and URLs associated with them. This is no problem in RDF, but your encoding above might not work for this case.
Markus
On Fri, Feb 5, 2016 at 1:50 PM Markus Kroetzsch < markus.kroetzsch@tu-dresden.de> wrote:
I think I already commented on this in other places. Wasn't there a tracker item where the derived values were discussed? Something to keep in mind here is that many properties have multiple URIs and URLs associated with them. This is no problem in RDF, but your encoding above might not work for this case.
The ticket is https://phabricator.wikimedia.org/T112548, along with its blockers, I assume.
Cheers Lydia
I noticed that the `math` data type validates the input (e.g. "/foo" is refused). Are there any limitations on what `external-id` is allowed to be?
André Costa, GLAM Developer, Wikimedia Sverige
On Fri, Feb 5, 2016 at 7:32 PM André Costa lokal.profil@gmail.com wrote:
I noticed that the `math` data type validates the input (e.g. "/foo" is refused). Are there any limitations on what `external-id` is allowed to be?
We did not add any new restrictions for it, no. We just keep the limitations we have for strings.
Cheers Lydia