Re: [Wikidata] Wikidata query performance paper

6 Aug 2016

Hi!

...
  The paper was recently accepted for presentation at the International
  Semantic Web Conference (ISWC) 2016. A pre-print is available here:

  http://aidanhogan.com/docs/wikidata-sparql-relational-graph.pdf

Thank you for the link!
It would be interesting to see the actual data representations used for
RDF (e.g. examples of the data or a more detailed description). I notice
that they differ substantially from what we use in the Wikidata Query
Service implementation, which runs on Blazegraph, and some of the
performance features we have implemented are probably not part of your
implementation. In any case, it would be interesting to know the details
of which RDF representations were used.

I also note that only statements and qualifiers are mentioned in most of
the text, with very little mention of sitelinks and references. Were they
part of the model too?
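
As a rough illustration of the level of detail meant here, the sketch
below writes out one statement in the reified shape described in the
Wikidata RDF dump format documentation, with a qualifier, a reference and
a sitelink, as plain Python tuples. All identifiers (Q1, P1, the statement
and reference node names, the article URL) are hypothetical placeholders,
and the helper function is only an assumption made for the example, not
part of any real implementation.

```python
# Hypothetical data: none of these identifiers refer to real Wikidata content.
TRIPLES = [
    # item -> statement node (the statement is reified as its own node)
    ("wd:Q1", "p:P1", "wds:Q1-stmt1"),
    # statement node -> main value
    ("wds:Q1-stmt1", "ps:P1", "wd:Q2"),
    # statement node -> qualifier value
    ("wds:Q1-stmt1", "pq:P2", '"2016-01-01"'),
    # statement node -> reference node, and the reference's own value
    ("wds:Q1-stmt1", "prov:wasDerivedFrom", "wdref:ref1"),
    ("wdref:ref1", "pr:P3", "wd:Q3"),
    # sitelink: the article page is "about" the item
    ("<https://en.wikipedia.org/wiki/Example>", "schema:about", "wd:Q1"),
    # "truthy" shortcut that skips the statement node
    ("wd:Q1", "wdt:P1", "wd:Q2"),
]


def statements(triples, item, prop):
    """Collect value, qualifiers and references for each statement node."""
    result = []
    for subj, pred, node in triples:
        if subj == item and pred == "p:" + prop:
            # All triples whose subject is this statement node
            about = [(p, o) for s, p, o in triples if s == node]
            result.append({
                "value": [o for p, o in about if p == "ps:" + prop],
                "qualifiers": [(p, o) for p, o in about if p.startswith("pq:")],
                "references": [o for p, o in about if p == "prov:wasDerivedFrom"],
            })
    return result
```

Even in this toy form, reading one statement with its qualifiers and
references means following the statement node in several directions, which
is the part of the model that any alternative representation has to
account for.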

Due to the different RDF semantics, it would also be interesting to get
more details about how the example queries were translated to the RDF
representation(s) used in the article. Was it an automatic process, or
were they translated manually? Is it possible to see them?

When working on the Query Service implementation, we considered a number
of possible representations, with regard to both performance and semantic
completeness. One of the conclusions was that achieving adequate
semantic completeness and performance on a relational database, while
allowing people to write complex queries (relatively) easily, is not
possible, because relational engines are not a good match for the
hierarchical, graph-like structures in Wikidata.

It would be interesting to look at the Postgres implementation of the
data model and queries to see whether your conclusions were different in
this case.

-- 
Stas Malyshev
smalyshev(a)wikimedia.org
