Hello Sebastian,

thanks for the pointers, this is indeed a very interesting project that ties in neatly with what we want to achieve in Wikidata. I didn't read your paper in detail but merely skimmed it, so I hope I have a reasonable understanding of your proposal.

We have not yet written down the use cases, and we are aware that this still needs to happen. I do think that your approach - especially with the context-hash-based URIs - is very promising.

We aim to start with a simple website-level scope for Web references, but personally I would be very happy if, already in the first year, we could move down to the more fine-grained level that your tools support, down to the very sentence that supports a statement in Wikidata. If we get that far, your approach seems like a very strong contender for representing this.

Just a few questions - as you note, it is easier if we all use the same standards, and so I want to ask about the relation to other related standards:
* I understand that you dismiss IETF RFC 5147 because it is not stable enough, right?
* what is the relation to the W3C media fragment URIs? I did not find a pointer to them.
* are there any plans to standardize your approach?

We would strongly prefer to just use a standard instead of advocating contenders for one -- if one exists.


2012/5/18 Sebastian Hellmann <hellmann@informatik.uni-leipzig.de>
Hello again,
maybe the question I asked was lost, as the text was TL;DR.

I heard that it is planned to track the provenance of facts, e.g. that Berlin has 3,337,000 citizens, as found here: http://www.worldatlas.com/citypops.htm
Do you have a place where the use case and the requirements are documented for this? Or is it out of scope?
Will it be coarse-grained, i.e. at the website level? Or fine-grained, i.e. at the text paragraph level? See e.g. how Berlin is highlighted in this very early prototype.

Could you give me a link where I can read more about any Wikidata plans in this direction?

On 05/16/2012 09:10 AM, Sebastian Hellmann wrote:
Dear all,
(Note: I could not find the document where your requirements regarding the tracking of facts on the web are written down, so I am giving a general introduction to NIF. Please send me a link to the document that specifies your need for tracing facts on the web, thanks.)

I would like to draw your attention to the URIs used in the NLP Interchange Format (NIF).
NIF URIs are quite easy to use, understand, and implement. NIF follows a one-triple-per-annotation paradigm. The latest documentation can be found here:

The basic idea is to use URIs with hash fragment ids to annotate or mark pages on the web:
An example is the first occurrence of "Semantic Web" on http://www.w3.org/DesignIssues/LinkedData.html  as highlighted here:

Here is a NIF example for linking a part of the document to the DBpedia entry of the Semantic Web:
     a str:StringInContext ;
     sso:oen <http://dbpedia.org/resource/Semantic_Web> .
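Roughly, the two fragment-id styles (offset-based and context-hash-based) work as sketched below. This Python snippet is only an illustration of the idea; the function names and the precise fragment format are made up here and are not the normative NIF syntax, which is defined in the spec linked above:

```python
import hashlib

def offset_fragment(doc: str, needle: str) -> str:
    # Illustrative offset-based fragment id: identifies the first
    # occurrence of `needle` by its character offsets in the document.
    start = doc.index(needle)
    end = start + len(needle)
    return f"offset_{start}_{end}"

def context_hash_fragment(doc: str, needle: str, context: int = 10) -> str:
    # Illustrative context-hash fragment id: hashes the string together
    # with `context` characters on either side, so the identifier can
    # still be resolved after edits elsewhere in the document.
    start = doc.index(needle)
    end = start + len(needle)
    window = doc[max(0, start - context):end + context]
    digest = hashlib.md5(window.encode("utf-8")).hexdigest()
    return f"hash_{context}_{len(needle)}_{digest}"

doc = "The Semantic Web is a web of data."
base = "http://example.org/LinkedData.html"
print(base + "#" + offset_fragment(doc, "Semantic Web"))
print(base + "#" + context_hash_fragment(doc, "Semantic Web"))
```

The resulting URI then serves as the subject of the annotation triples, giving the one triple per annotation mentioned above.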

We are currently preparing a new draft for the spec 2.0. The old one can be found here:

There are several EU projects that intend to use NIF. Furthermore, it is easier for everybody if we standardize a Web annotation format together.
Please give feedback on your use cases.
All the best,

Dipl. Inf. Sebastian Hellmann
Department of Computer Science, University of Leipzig
Projects: http://nlp2rdf.org , http://dbpedia.org
Homepage: http://bis.informatik.uni-leipzig.de/SebastianHellmann
Research Group: http://aksw.org

Wikidata-l mailing list

Project director Wikidata
Wikimedia Deutschland e.V. | Obentrautstr. 2 | 10963 Berlin
Tel. +49-30-219 158 26-0 | http://wikimedia.de

Wikimedia Deutschland - Society for the Promotion of Free Knowledge e.V. Registered in the register of associations of the Amtsgericht Berlin-Charlottenburg under number 23855 B. Recognized as a charitable organization by the Finanzamt für Körperschaften I Berlin, tax number 27/681/51985.