Hey all,
so I see there is some work being done on mapping the Wikidata data model to RDF [1].
Just a thought: what if you had actually used RDF, with Wikidata's concepts modeled in it, right from the start? And used standard RDF tools, APIs, and the query language (SPARQL) instead of building the whole thing from scratch?
Is it just me or was this decision really a colossal waste of resources?
[1] http://korrekt.org/papers/Wikidata-RDF-export-2014.pdf
Martynas http://graphityhq.com
Hoi, Hell no. Wikidata is first and foremost a product that is actually used. It has been that way from the start. Prioritising RDF over actual practical use cases is imho wrong. If anything, the continuous tinkering with the format of the dumps has mostly brought us grief. Dumps that can no longer be read, as is currently the case for the Wikidata statistics, really hurt.
So let's not spend time on RDF at this time. Let's ensure that what we have works, and works well, and plan carefully for better RDF support, but let's only have it go into production AFTER we know that it works well. Thanks, GerardM
The data model is close to RDF, but not quite: statements on items are reified statements, etc. Technically it is semantic data, of which RDF is one possible representation.
There was a design decision to keep MediaWiki, to ease reuse within the Wikimedia sites, mostly so users could reuse their knowledge, but also so devs could reuse existing infrastructure.
Some of the problems with Wikidata come from the fact that the similarities aren't clear enough for the users, and possibly the devs, which has resulted in a slightly introverted community and a technical structure that is slightly more Wikipedia-centric than necessary.
John, please see inline:
On Tue, Oct 28, 2014 at 8:39 AM, John Erling Blad jeblad@gmail.com wrote:
> The data model is close to RDF, but not quite: statements on items are reified statements, etc. Technically it is semantic data, of which RDF is one possible representation.
Well, it has been shown (in the paper I referenced) that Wikidata can be modeled as RDF. And there is no reason why it couldn't be, because in RDF anyone can say anything about anything.
> There was a design decision to keep MediaWiki, to ease reuse within the Wikimedia sites, mostly so users could reuse their knowledge, but also so devs could reuse existing infrastructure.
This is exactly the decision that I question. I think it was completely misguided. If the goal was to reuse knowledge and infrastructure, then Wikidata has failed completely, as there is more infrastructure for and knowledge of RDF than there ever will be for MediaWiki, or any other structured/semantic data model for that matter.
> Some of the problems with Wikidata come from the fact that the similarities aren't clear enough for the users, and possibly the devs, which has resulted in a slightly introverted community and a technical structure that is slightly more Wikipedia-centric than necessary.
Here I can only agree with you. That is not an RDF problem, though.
Gerard,
what about query functionality, for example? It has long been promised but shows no real progress.
And why do you think practical use cases cannot be implemented using RDF? What is the justification for ignoring the whole stack of RDF standards and implementations? What makes you think Wikidata can do better than RDF?
Martynas
Hoi, Query has been promised, and unofficially we have had it for a VERY long time. It is called WDQ, and it is used in many tools. The official query service will only provide a subset of that functionality for quite some time, as I understand it.
Practical use cases in RDF, for what and by whom? Wikidata is first and foremost a vehicle to bring interwiki links to our projects. Then and only then does it become relevant to store data about the items involved. This data may be used in infoboxes and whatnot in our projects. THAT is practical use to our community.
RDF may be of interest to others, and it may be possible for them to do practical things with it, but that does not make it a priority. I do not think Wikidata can do better. As far as I am concerned it is the least of our problems. The reuse of data is first to happen within our projects, and THAT is not so much of a technical problem at all. Thanks, GerardM
Gerard,
what is practical about having a query language that 1) is not a standard and never will be, and 2) is not supported by any other tool or project and never will be?
I would understand this kind of reasoning coming from a hobbyist project, but not from one claiming to be a global "free linked database".
Martynas
Hoi, I find it funny that you ask EXACTLY the right question but get the opposite answer: I do not care about the query language, I care about it being available. As a consequence, millions of edits have been made. Consequently, this query tool is practical. I have been told that a layer on top of it that produces RDF is feasible.
Now, RDF may exist and may be usable for you in your wide world, but in the corner of this world where Wikidata is filled with data, this tool is invaluable and RDF does not exist.
To be honest, yes, it is important that we become a free linked database, but not at the price of having to wait for the officially sanctioned product. Not at the cost of imposed stagnation. Thanks, GerardM
On 28.10.2014 11:26, Martynas Jusevičius wrote:
> And why do you think practical use cases cannot be implemented using RDF? What is the justification for ignoring the whole stack of RDF standards and implementations? What makes you think Wikidata can do better than RDF?
We don't ignore the standard, but, from the start, considered RDF one way to represent the data we have in Wikidata. The fact that the RDF mapping is currently only partially implemented is annoying, but perhaps understandable, since other features (e.g. the query capability you mention) are more urgent.
However, the data model used by Wikibase/Wikidata is different from RDF on a conceptual level. In fact, it's much closer to ISO Topic Maps, if you want a standard.
The point is that in RDF, you typically have statements like "X is a Y" or "The z of X has value n". These are "facts" that can easily be used by an inference engine, and queried using SPARQL. Wikidata, however, does not contain such facts. It contains *claims*. Wikidata statements have the form "X is a Y according to a and b, in the context of C" (e.g. "The population of Berlin is 3.5 Million, according to survey 768234, in 2011"). "Deep" statements like this *can* be modeled in RDF, but the result is rather inconvenient to work with, and quite useless for SPARQL queries; just because you *can* model this email as a single number does not mean it is useful to do so - and even if it is useful for some use case, that doesn't mean it's a good idea to use that number as the primary, internal representation of this email.
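To make the "depth" concrete, here is a minimal Turtle sketch of the Berlin claim in a statement-centric encoding, loosely following the scheme of the RDF export paper linked at the start of the thread [1]; the statement and reference identifiers are invented, and the s/v/q/r property suffixes are one reading of that scheme, not an official vocabulary:

    @prefix wd: <http://www.wikidata.org/entity/> .
    @prefix prov: <http://www.w3.org/ns/prov#> .
    @prefix xsd: <http://www.w3.org/2001/XMLSchema#> .

    # The item does not point at a plain value, but at a statement node:
    wd:Q64 wd:P1082s wd:Q64S123 .                        # Berlin -> population *statement*
    wd:Q64S123 wd:P1082v "3500000"^^xsd:decimal ;        # the claimed value
        wd:P585q "2011-01-01T00:00:00Z"^^xsd:dateTime ;  # qualifier: point in time
        prov:wasDerivedFrom wd:R456 .                    # pointer to a reference node
    wd:R456 wd:P248r wd:Q999999 .                        # "stated in": some survey item

Already, answering "what is the population of Berlin?" means joining through the statement node, which is exactly the inconvenience described above.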
Even if you ignore the "depth" of the statements, the "top level" claims (population of Berlin = 3.5 million) are too vague, too loosely connected, and too prone to errors to be useful for SPARQL-based processing. DBpedia's SPARQL endpoint is pretty useless for all but trivial questions, because the data is too dirty, even though the DBpedia team does a commendable job at cleaning the data via a variety of heuristics.
Another issue is scale. Because of the depth of the statements, we would need hundreds of millions of triples to represent even the data we have today. Expect this to be billions in a year or two. Few, if any, triple stores scale that way.
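A back-of-envelope estimate, assuming on the order of 40 million statements and roughly eight triples per statement in the deep encoding (both numbers are assumptions, not official counts):

    40,000,000 statements x ~8 triples/statement ~ 320,000,000 triples

and that is before labels, descriptions and sitelinks are counted, which squares with the "close to one billion" figure mentioned later in this thread.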
Martynas,
we have had this discussion on this list before, and again I am irked by your claim that we could just use standard RDF tools out of the box for Wikidata.
I will shut up and concede that you are right if you manage to set up a standard open source RDF tool on an open source stack that contains the Wikidata knowledge base, is keeping up to date with the rate of changes that we have, and is able to answer queries from the public without choking and dying for 24 hours, before this year is over. Announce a few days in advance on this list when you will make the experiment.
Technology has advanced by three years since we made the decision not to use standard RDF tools, so I am sure it should be much easier today. But last time I talked with people writing such tools, they were rather cautious due to our requirements.
We still wouldn't have proven that it could deal with the expected QPS Wikidata will have, but heck, I would be surprised and I would admit that I was wrong with my decision if you can do that.
Seriously, we did not snub RDF and SPARQL because we don't like it or don't know it. We decided against it *because* we know it so well and we realized it does not fulfill our requirements.
Cheers, Denny
Martynas,
Denny is right. You could set up a Virtuoso endpoint based on our RDF exports. This would be quite nice to have. That's one important reason why we created the exports, and I really hope we will soon see this happening. We are dealing here with a very large project, and the decision for or against a technology is not just a matter of our personal preference. If RDF can demonstrate added value, then there will surely be resources to further extend the support for it. So far, we are in the lead: we provide close to one billion (!) triples of Wikidata knowledge to the world, and so far there is no known use of this data. We need to go step by step: some support from us, some practical usage from the RDF community, some more support from us, ...
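For anyone who wants to try, the bulk-loading side is only a few lines in isql; a minimal sketch, assuming a stock Virtuoso 7 setup, N-Triples dump files, and placeholder paths and graph IRI:

    -- the directory must be listed under DirsAllowed in virtuoso.ini
    ld_dir ('/data/wikidata', '*.nt.gz', 'http://www.wikidata.org/');
    rdf_loader_run ();
    checkpoint;

The hard part, as Denny says, is not the one-off load but keeping up with the edit rate and the query load.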
In reply to your initial email, Martynas, I have to say that you seem to have very little knowledge of what is going on in Wikidata. If you followed the development reports more closely, you would know that most of the work is going into components that RDF does not replace at all. Querying with SPARQL is nice, but we are still more focused on UI issues, history management, infrastructure integration (such as pushing changes to other sites), and many more things that are completely unrelated to RDF. Your suggestion that a single file format would somehow magically make the construction of one of the world's largest community-edited knowledge bases a piece of cake is just naive.
Now don't get me wrong: naive thinking has its place in Wikidata (it's always naive to try what others consider impossible), but it should be combined with a positive, forward-thinking attitude. I hope that our challenge to show us the power of RDF can unleash some positive energies in you :-) I am looking forward to your results (and happy to help if you need more details about the RDF dumps etc.).
Best wishes,
Markus
FWIW, put me in the camp of "people who want to see Wikidata available via RDF" as well. I won't argue that RDF needs to be the *native* format for Wikidata, but I think it would be a crying shame for such a large knowledge base to be cut off from seamless integration with the rest of the Linked Data world.
That said, I don't really care if RDF/SPARQL support comes later and is treated as an "add-on", but I do think Wikidata should at least have that as a goal for "eventually". And if I can help make that happen, I'll try to pitch in however I can. I have some experiments going now, working on some new approaches to scaling RDF triplestores, so the Wikidata data may be an interesting testbed for that down the road.
And on a related note (apologies if this has been discussed to death, but I haven't been on the list since the beginning): is there any formal collaboration (in-place|proposed|possible) between DBpedia and Wikidata?
Phil
This message optimized for indexing by NSA PRISM
Hey Phillip :)
Help with this would be awesome and totally welcome. The tracking bug is at https://bugzilla.wikimedia.org/show_bug.cgi?id=48143
Cheers Lydia
Speaking of totally awesome (aehm :D):
- see: http://wikidataldf.com
- see this other thread: https://lists.wikimedia.org/pipermail/wikidata-l/2014-October/004920.html
(If I can ask, having the RDF dumps in HDT format [again, see the other thread] would be really helpful)
C
We are using OpenRDF. Can it do HDT? If yes, this would be easy to do. If no, it would be easier to use a standalone tool to transform our dumps. We could still do this. Do you have any recommendation for what we could use there (i.e., a memory-efficient command-line conversion tool for N3 -> HDT)?
Markus
It seems that OpenRDF does not support HDT creation (see [1]). I have been using the rdf2hdt tool, obtained by compiling the devel branch of the hdt-cpp library [2], which is developed by the group that is proposing the standard to the W3C. C
[1] https://openrdf.atlassian.net/browse/SES-1874
[2] https://github.com/rdfhdt/hdt-cpp/tree/devel
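For reference, a typical invocation looks roughly like this; the input-format flag is an assumption, and flags vary between hdt-cpp revisions, so check the tool's help output:

    # N-Triples dump in, HDT file out
    rdf2hdt -f ntriples wikidata-statements.nt wikidata-statements.hdt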
Hi Martynas and all,
Thanks for this engaging Wikidata RDF conversation.
Wikidata RDF developments are exciting, especially for eventually coding with IBM's Watson and related AI. See this related conversation, for example:
"Poster Title: Not Elementary, My Dear Watson - Extending Watson for Question Answering on Linked Open Government Data
RPI doctoral student Amar Viswanathan Kannan, working with Prof. James Hendler, presented this poster at yesterday's Cognitive Colloquium at Yorktown:
Linked Data, stored as RDF graphs lets users to traverse through heterogeneous knowledge bases with relative ease. In addition it also allows for data to be viewed from different perspectives and is also able to provide multiple conceptualizations of data. This becomes very important owing to the heterogeneous nature of the web. While traditional linked data technologies like the Simple Protocol and RDF Query Language(SPARQL) allow us to access the Linked Data knowledge bases, it requires considerable skill to design queries to access Linked Data triple stores. It is also a shift from looking at data as traditional RDBMS databases to knowledge graphs. The growing acceptance of Linked Data triple stores as general purpose knowledge bases for a variety of domains has necessitated the need for accessing such knowledge with greater ease. Enter Watson, IBMs flagship Question Answering System. It has been at the forefront of Question Answering systems for being able to answer factoid questions on the “Jeopardy!” quiz game show with pinpoint precision. The architecture of the DeepQA system, of which Watson is an application has captured the imaginations of the Artificial Intelligence Community, which has long strived to build Cognitive Systems. The DeepQA system excels at generating hypotheses and gathering evidence to refute or support these hypotheses. It also evaluates all the evidence and provides analytics."
https://www.linkedin.com/groups/Poster-Title-Not-Elementary-My-6729452.S.593...
Cheers, Scott
On 10/29/14 5:59 PM, Lydia Pintscher wrote:
> Help with this would be awesome and totally welcome. The tracking bug is at https://bugzilla.wikimedia.org/show_bug.cgi?id=48143
Lydia,
Linked Open Data URIs for tracking issues such as the one above:
[1] http://linkeddata.uriburner.com/about/id/entity/https/bugzilla.wikimedia.org...
[2] http://bit.ly/vapour-report-sample-wikidata-issue-tracking-entity-http-uri -- vapour report on the Linked Data URI above
[3] http://linkeddata.uriburner.com/c/9BTVWIGG -- use of #this to make a Linked Open Data URI "on the fly" (no owl:sameAs reasoning and inference applied)
[4] http://linkeddata.uriburner.com/c/8GUIAJ -- ditto, but with owl:sameAs reasoning and inference applied.
Since this mailing list is online, I can also add some RDF statements into this post. Basically, this turns said post (or any other such conversation) into a live Linked Open Data creation and publication mechanism, by way of nanotation [1].
## Nanotation Start ##
<http://linkeddata.uriburner.com/about/id/entity/https/bugzilla.wikimedia.org/show_bug.cgi?id=48143> xhv:related <https://twitter.com/hashtag/RDF#this> ; is foaf:primaryTopic of <http://linkeddata.uriburner.com/c/8GUHZ7>, <http://bit.ly/vapour-report-sample-wikidata-issue-tracking-entity-http-uri> .
## Nanotation End ##
Links:
[1] http://kidehen.blogspot.com/2014/07/nanotation.html -- Nanotation
[2] http://linkeddata.uriburner.com/about/html/%7Burl-of-this-reply-once-its-liv... -- URL pattern that will show the effects (reified statements/claims, amongst other things) of the nanotations above.
Here's my take.
RDF standards, in themselves, don't address all of the issues a data wiki needs to handle. I've been thinking about the math for data wikis, and it seems to me you could have a bipartite system where you have "the fact" and "the operational metadata about the fact", which are conceptually two different things. Then you can query against the operational metadata to project out RDF or similar facts.
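One way to approximate that split with off-the-shelf machinery is to give each fact its own named graph and hang the operational metadata off the graph IRI; a SPARQL sketch, with an invented ex: metadata vocabulary, that projects out only the facts whose metadata passes muster:

    PREFIX ex: <http://example.org/meta#>

    # the fact lives in its own named graph; metadata hangs off the graph IRI
    CONSTRUCT { ?s ?p ?o }
    WHERE {
      GRAPH ?claim { ?s ?p ?o }
      ?claim ex:source ?src ;
             ex:confidence ?c .
      FILTER (?c >= 0.9)
    }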
Had the Wikidata people had the right idea at the time, and good luck, they might have been able to build such a system. As it is, they came up with a plan they were able to execute, and they did it well.
The real trouble with an RDF export from Wikidata is that if you do a complete export, you're going to get something tangled up that you can't query easily with SPARQL.
There are really two answers to this. One is the essentially forward-chaining approach of "create a canonical data model" (it doesn't need to be the "one ring to rule them all" for the whole web, just the one you need for your job); then you can project out an RDF graph in which you've expressed your opinions about all the opinions in Wikidata.
There's a backward-chaining kind of strategy where you, effectively, run the query multiple times with different strategies and then do data clean-up post-query. That's an interesting topic too, and one that again demands something beyond ordinary RDF. Since RDF and SPARQL are so well specified, it is also possible to extend them to do new things, such as "tainting" facts with their origin and propagating it to the output.
I think people are also realizing that ISO Common Logic is a superset of RDF, and it is really about time that support for arity > 2 predicates came around. Note that arity > 2 already exists in W3C specifications, in that the fundamental object in SPARQL is a "SPARQL result set", which is an arbitrary-length tuple of nodes. It is clear what should happen if you write a triple pattern like
{ ?s ?p ?o1 ?o2 ?o3 . }
This would also give a more direct mapping from SQL to SPARQL, one that would be comfortable if there were some syntactic sugar to specify fields by name.
Yes, you can fake it by writing triple patterns, but in practice people struggle to get even simple SQL-like queries to work right, and can't do the very simple things people did with production rule systems back in the 1970s.
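For concreteness, the "faked" version of the pattern above in standard SPARQL goes through an intermediate node, with one triple per argument position (the ex: argument properties are invented):

    PREFIX ex: <http://example.org/nary#>

    SELECT ?s ?o1 ?o2 ?o3
    WHERE {
      ?s ?p ?stmt .          # the subject points at an n-ary statement node
      ?stmt ex:arg1 ?o1 ;    # one join per extra argument position
            ex:arg2 ?o2 ;
            ex:arg3 ?o3 .
    }

Every extra argument position costs another join, which is exactly where hand-written queries start to go wrong.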
OWL was designed on the basis of math, not on the basis of "what are the requirements for large-scale data integration". Thus it lacks very basic facilities, such as numeric conversions between, say, heights in different units.
Hi Phillip,
Are you aware of the Wikidata RDF exports at http://tools.wmflabs.org/wikidata-exports/rdf/ ? Do they meet your requirements for now or do you need something different? If you have specific plans for the RDF, I would be curious to learn about them.
Cheers,
Markus
Only in passing; I'm just starting to really dip my toes into the Wikidata waters now. Offhand, I'd say that having RDF dumps is great, depending on how frequently they are exported. Of course I'd love to see live access to the current data via SPARQL in general, but my specific use case can be driven off exports.
Basically, I work on applying Semantic Web technology to enterprise/organizational knowledge management, using tools like Jena and Stanbol. As part of that, we do content enhancement and automatic entity linking with Stanbol. Right now we mainly use DBpedia for that, but I'm trying to figure out how data from Wikidata will play into this as well.
Phil
Hi,
I am running the Wikidata query tool (WDQ) at http://wdq.wmflabs.org/
WDQ can run many advanced queries, but it uses my own bespoke query language.
I could try to write a wrapper around it, but I have not had much (aka "none") experience with SPARQL. Are there some common use case examples (even fictional ones) I could look at, or does anyone want to collaborate on a wrapper?
Cheers, Magnus
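A common first use case, sketched against the 2014 RDF export vocabulary: people born in Berlin, with their date of birth. The P19c/P569c "simplified claim" IRIs follow one reading of the export paper and may need adjusting to the actual dumps:

    PREFIX wd: <http://www.wikidata.org/entity/>

    SELECT ?person ?dob
    WHERE {
      ?person wd:P19c wd:Q64 ;   # place of birth: Berlin (Q64)
              wd:P569c ?dob .    # date of birth
    }
    LIMIT 100

This is roughly the job that something like CLAIM[19:64] does in WDQ, if I read its syntax correctly.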