Hello all!
There have been a number of concerns raised about the performance and scaling of the Wikidata Query Service. We share those concerns and we are doing our best to address them. Here is some info about what is going on:
In an ideal world, WDQS should:
* scale in terms of data size
* scale in terms of number of edits
* have low update latency
* expose a SPARQL endpoint for queries
* allow anyone to run any queries on the public WDQS endpoint
* provide great query performance
* provide a high level of availability
Scaling graph databases is a "known hard problem", and we are reaching a scale where there are no obvious easy solutions to address all the above constraints. At this point, just "throwing hardware at the problem" is not an option anymore. We need to go deeper into the details and potentially make major changes to the current architecture. Some scaling considerations are discussed in [1]. This is going to take time.
Realistically, addressing all of the above constraints is unlikely to ever happen. Some of the constraints are non-negotiable: if we can't keep up with Wikidata in terms of data size or number of edits, it does not make sense to address query performance. On some constraints, we will probably need to compromise.
For example, the update process is asynchronous. It is by nature expected to lag. In the best case, this lag is measured in minutes, but can climb to hours occasionally. This is a case of prioritizing stability and correctness (ingesting all edits) over update latency. And while we can work to reduce the maximum latency, this will still be an asynchronous process and needs to be considered as such.
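For illustration, here is one way to estimate the current lag yourself from the public endpoint. This is a minimal sketch in Python; it relies on the commonly used schema:dateModified freshness check, so treat that pattern as an assumption rather than a guaranteed API.

from datetime import datetime, timezone
import requests

# Sketch: ask the endpoint when its copy of wikidata.org was last modified,
# then compare with the current time to approximate the update lag.
QUERY = "SELECT ?modified WHERE { <http://www.wikidata.org> schema:dateModified ?modified . }"

response = requests.get(
    "https://query.wikidata.org/sparql",
    params={"query": QUERY, "format": "json"},
    headers={"User-Agent": "wdqs-lag-check/0.1 (example)"},
    timeout=30,
)
response.raise_for_status()
value = response.json()["results"]["bindings"][0]["modified"]["value"]
modified = datetime.fromisoformat(value.replace("Z", "+00:00"))
lag = datetime.now(timezone.utc) - modified
print(f"approximate update lag: {lag.total_seconds() / 60:.1f} minutes")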
We currently have one Blazegraph expert working with us to address a number of performance and stability issues. We are planning to hire an additional engineer to help us support the service in the long term. You can follow our current work in phabricator [2].
If anyone has experience with scaling large graph databases, please reach out to us, we're always happy to share ideas!
Thanks all for your patience!
Guillaume
[1] https://wikitech.wikimedia.org/wiki/Wikidata_query_service/ScalingStrategy [2] https://phabricator.wikimedia.org/project/view/1239/
Hoi, Thank you for this answer. It helps. It helps to understand / appreciate the work that is done. Without updates like this, it becomes increasingly hard to be confident that our future will remain bright. Thanks, GerardM
Thanks, Guillaume - this is very helpful, and it would be great to have similar information posted/collected on other kinds of limits and potential approaches to addressing them.
Some weeks ago, we started a project to keep track of such limits, and I have added pointers to your information there: https://www.wikidata.org/wiki/Wikidata:WikiProject_Limits_of_Wikidata .
If anyone is aware of similar discussions for any of the other limits, please edit that page to include pointers to those discussions.
Thanks!
Daniel
On Thu, 6 Jun 2019 at 21:33, Guillaume Lederrey glederrey@wikimedia.org wrote:
Hello all!
There have been a number of concerns raised about the performance and scaling of the Wikidata Query Service. We share those concerns and we are doing our best to address them. Here is some info about what is going on:
In an ideal world, WDQS should:
- scale in terms of data size
- scale in terms of number of edits
- have low update latency
- expose a SPARQL endpoint for queries
- allow anyone to run any queries on the public WDQS endpoint
- provide great query performance
- provide a high level of availability
I will add that, in an ideal world, setting up Wikidata (i.e. the interface that allows edits, the entity search service, and WDQS) and Wikidata tools should be (more) accessible.
Scaling graph databases is a "known hard problem", and we are reaching a scale where there are no obvious easy solutions to address all the above constraints. At this point, just "throwing hardware at the problem" is not an option anymore.
Realistically, addressing all of the above constraints is unlikely to ever happen.
never say never ;-)
For example, the update process is asynchronous. It is by nature expected to lag. In the best case, this lag is measured in minutes, but can climb to hours occasionally. This is a case of prioritizing stability and correctness (ingesting all edits) over update latency. And while we can work to reduce the maximum latency, this will still be an asynchronous process and needs to be considered as such.
We currently have one Blazegraph expert working with us to address a number of performance and stability issues. We are planning to hire an additional engineer to help us support the service in the long term. You can follow our current work in phabricator [2].
If anyone has experience with scaling large graph databases, please reach out to us, we're always happy to share ideas!
Good luck!
Thanks all for your patience!
Guillaume
[1] https://wikitech.wikimedia.org/wiki/Wikidata_query_service/ScalingStrategy
Here is my point of view regarding some discussion happening on the talk page:
Giving up on SPARQL.
There is an ongoing effort to draft a 1.2 version of SPARQL (https://github.com/w3c/sparql-12). It is the right time to give some feedback.
Also, look at https://github.com/w3c/EasierRDF/
JanusGraph (http://janusgraph.org/), successor of Titan (now part of DataStax): written in Java, using scalable data storage (Cassandra/HBase) and indexing engines (Elasticsearch/Solr), queryable.
That would make Wikidata much less accessible, even if JanusGraph has an Oracle Berkeley DB backend. The full-text search and geospatial indices live in yet another process.
I can't think of any other way than transforming the Wikidata RDF representation into one more suitable for property-graph engines.
FWIW, OpenCog's AtomSpace has a Neo4j backend but they do not use it.
Also, property-graph engines make it slow to represent things like:
("wikidata", "used-by", "opencog") and ("wikidata", "used-by", "google")
That is, one has to create a hyper-edge if you want to be able to query those facts.
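To make the difference concrete, here is a small, purely illustrative Python sketch: in RDF the two facts above are just two triples, while a property-graph engine gives every edge its own identity, which is also what lets qualifiers be attached to it (the "since" qualifier below is hypothetical).

from dataclasses import dataclass, field
from typing import Dict

# RDF view: the two facts are simply two triples in a set.
triples = {
    ("wikidata", "used-by", "opencog"),
    ("wikidata", "used-by", "google"),
}

# Property-graph view: each fact becomes an edge object with its own identity,
# which keeps parallel edges distinguishable and lets qualifiers hang off them.
@dataclass
class Edge:
    source: str
    label: str
    target: str
    qualifiers: Dict[str, str] = field(default_factory=dict)

edges = [
    Edge("wikidata", "used-by", "opencog"),
    Edge("wikidata", "used-by", "google", qualifiers={"since": "2015"}),  # hypothetical qualifier
]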
Best regards,
Amirouche ~ amz3
I made a proposal for a grant at https://meta.wikimedia.org/wiki/Grants:Project/WDQS_On_FoundationDB
Mind the fact that this is not about the versioned quad store. It is about a simple triple store; it is mainly missing bindings for FoundationDB and SPARQL syntax.
Also, I will probably need help to interface with the geo and label services.
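To give an idea of the design, here is a toy sketch of the classic layout: each triple is written under several orderings of an ordered key space, so any pattern with a bound prefix becomes a range scan. Plain Python stands in for the ordered key-value store; this is illustrative, not the actual engine.

import bisect

class ToyTripleStore:
    """Each triple is stored under three orderings so bound prefixes become range scans."""

    ORDERS = ("spo", "pos", "osp")

    def __init__(self):
        self.indexes = {order: [] for order in self.ORDERS}

    def add(self, s, p, o):
        terms = {"s": s, "p": p, "o": o}
        for order, index in self.indexes.items():
            key = tuple(terms[c] for c in order)
            position = bisect.bisect_left(index, key)
            if position == len(index) or index[position] != key:
                index.insert(position, key)

    def match(self, s=None, p=None, o=None):
        wanted = {"s": s, "p": p, "o": o}
        # Pick the ordering whose bound terms form a key prefix; anything left over
        # is checked by the post-filter below.
        if s is not None and p is not None:
            order, prefix = "spo", (s, p)
        elif p is not None and o is not None:
            order, prefix = "pos", (p, o)
        elif s is not None:
            order, prefix = "spo", (s,)
        elif o is not None:
            order, prefix = "osp", (o,)
        elif p is not None:
            order, prefix = "pos", (p,)
        else:
            order, prefix = "spo", ()
        index = self.indexes[order]
        for key in index[bisect.bisect_left(index, prefix):]:
            if key[:len(prefix)] != prefix:
                break  # ran past the range sharing the prefix
            terms = dict(zip(order, key))
            if all(value is None or terms[name] == value for name, value in wanted.items()):
                yield (terms["s"], terms["p"], terms["o"])

store = ToyTripleStore()
store.add("wikidata", "used-by", "opencog")
store.add("wikidata", "used-by", "google")
print(list(store.match(p="used-by")))  # both facts, answered from the "pos" index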
Feedback welcome!
I got "feedback" in others threads from the same topic that I will quote and reply to.
So there needs to be some smarter solution, one that we'd be unlikely to develop in-house
Big cat, small fish. As Wikidata continues to grow, it will have specific needs, needs that are unlikely to be solved by off-the-shelf solutions.
but one that has already been verified by industry experience and other deployments.
FoundationDB and WiredTiger are used respectively at Apple (among other companies) and in MongoDB since 3.2, all over the world. WiredTiger is also used at Amazon.
We also have a plan on improving the throughput of Blazegraph, which we're working on now.
What is the Phabricator ticket, please?
"Evaluation of Metadata Representations in RDF stores"
I don't understand how this is related to the scaling issues.
[About the proprietary version of Virtuoso], I dare say [it must have an] enormous advantage for us to consider running it in production.
That would be vendor lock-in for Wikidata and Wikimedia, along with all the poor souls that try to interoperate with it.
This project seems to be still very young.
First commit https://github.com/arangodb/arangodb/commit/6577d5417a000c29c9ee7666cbcc3cefae6eee21 is from 2011.
ArangoDB seems to be a document database inside.
It has two backends: MMAP and RocksDB.
While I would be very interested if somebody took it upon themselves to model Wikidata in terms of ArangoDB documents,
It looks like a bounty.
ArangoDB is a multi-model database; it supports:
- Document
- Graph
- Key-Value
load the whole data and see what the resulting performance would be, I am not sure it would be wise for us to invest our team's - very limited currently - resources into that.
I am biased. I would advise against trying ArangoDB. This is another short-term solution.
the concept of having a single data store is probably not realistic, at least within foreseeable timeframes.
Incorrect. My solution is in the foreseeable future.
We use separate data store for search (ElasticSearch) and probably will have to have separate one for queries, whatever would be the mechanism.
It would be interesting to read how much "resource" is poured into keeping all those synchronized:
- ElasticSearch
- MySQL
- BlazeGraph
Maybe some REDIS?
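To make that cost concrete, each of those secondary stores needs its own version of roughly the following loop. This is a minimal sketch of the general polling pattern, not the actual WMF updater; the recentchanges API and Special:EntityData are public interfaces, while push_to_secondary_store is a hypothetical placeholder.

import time
import requests

API = "https://www.wikidata.org/w/api.php"

def push_to_secondary_store(entity_id: str, turtle: str) -> None:
    # Placeholder: this is where Blazegraph / ElasticSearch / anything else
    # would be updated with the fresh data for the entity.
    print(f"would update {entity_id} ({len(turtle)} bytes of RDF)")

def poll_once(since: str) -> str:
    changes = requests.get(API, params={
        "action": "query", "list": "recentchanges", "format": "json",
        "rcnamespace": 0,              # items only, for brevity
        "rcprop": "title|timestamp", "rclimit": 50,
        "rcstart": since, "rcdir": "newer",
    }, timeout=30).json()["query"]["recentchanges"]
    for change in changes:
        qid = change["title"]
        turtle = requests.get(
            f"https://www.wikidata.org/wiki/Special:EntityData/{qid}.ttl",
            timeout=30,
        ).text
        push_to_secondary_store(qid, turtle)
    return changes[-1]["timestamp"] if changes else since

cursor = "2019-06-06T00:00:00Z"
while True:
    cursor = poll_once(cursor)
    time.sleep(10)  # batching and polling like this is where the lag comes from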
Hi Amirouche,
On 12.06.19 14:07, Amirouche Boubekki wrote:
So there needs to be some smarter solution, one that we'd be unlikely to develop in-house
Big cat, small fish. As Wikidata continues to grow, it will have specific needs, needs that are unlikely to be solved by off-the-shelf solutions.
Are you suggesting developing the database in-house? Even MediaWiki uses MySQL.
but one that has already been verified by industry experience and other deployments.
FoundationDB and WiredTiger are used respectively at Apple (among other companies) and in MongoDB since 3.2, all over the world. WiredTiger is also used at Amazon.
Let's not talk about MongoDB; it is irrelevant and opinions are very mixed. Some say it is THE solution for scalability, others have said it was the biggest disappointment.
Do FoundationDB and WiredTiger have any track record for hosting open data projects or being chosen by open data projects? PostgreSQL and MySQL are widely used, e.g. by OpenStreetMap. Virtuoso is used by DBpedia, the LOD Cloud cache and UniProt.
I don't know FoundationDB or WiredTiger, but in the past there were often these open-source projects published by large corporations that worked in-house, but not in the open-source variant. Apache UIMA was one such example. Maybe Blazegraph works much better if you move to Neptune; that could be a sales hook.
Any open data projects that are running open databases with FoundationDB and WiredTiger? Where can I query them?
"Evaluation of Metadata Representations in RDF stores"
I don't understand how this is related to the scaling issues.
Not 100% pertinent, but do you have a better paper?
[About the proprietary version of Virtuoso], I dare say [it must have an] enormous advantage for us to consider running it in production.
That would be vendor lock-in for Wikidata and Wikimedia, along with all the poor souls that try to interoperate with it.
Actually UniProt and Kingsley suggested hosting the open-source version. It sounded like this will hold for 5 more years, which is probably the average lifecycle. There is also SPARQL, which normally doesn't do vendor lock-in. Maybe you mean that nobody can rent 15 servers and install the same setup as WMF's for Wikidata. That would be true. Switching always seems possible, though.
Hello Sebastian,
First thanks a lot for the reply. I started to believe that what I was saying was complete nonsense.
On Wed, 12 Jun 2019 at 16:51, Sebastian Hellmann <hellmann@informatik.uni-leipzig.de> wrote:
Hi Amirouche, On 12.06.19 14:07, Amirouche Boubekki wrote:
So there needs to be some smarter solution, one that we'd be unlikely to develop in-house
Big cat, small fish. As Wikidata continues to grow, it will have specific needs, needs that are unlikely to be solved by off-the-shelf solutions.
Are you suggesting developing the database in-house?
Yes! At least part of it: the domain-specific part.
even MediaWiki uses MySQL
Yes, but it is not because of its technical merits. Similarly for PHP. Historically, PHP and MySQL were easy to set up and easy to use but otherwise difficult to work with. This is/was painful enough that nowadays the go-to RDBMS is PostgreSQL even if MySQL is still very popular [0][1]. Those are technical reasons. Also, I agree that the fact that MySQL had no ACID guarantees when it started in 1995 does not make it a bad choice nowadays.
[0] https://trends.google.com/trends/explore?q=MySQL,PostgreSQL [1] https://stackshare.io/stackups/mysql-vs-postgresql
but one that has already been verified by industry experience and other deployments.
FoundationDB and WiredTiger are used respectively at Apple (among other companies) and in MongoDB since 3.2, all over the world. WiredTiger is also used at Amazon.
Let's not talk about MongoDB, it is irrelevant and very mixed.
I am giving an example deployment of WiredTiger. WiredTiger is an ordered key-value store that has been the storage engine of MongoDB since 3.2. It was created by an independent company that MongoDB later acquired. It is still GPLv2 or v3. Among the founders is one of the engineers who created Berkeley DB, which Oracle bought. Also, I am not saying WiredTiger solves all the problems of MongoDB. I am just saying that, because WiredTiger has been the storage backend of MongoDB since 3.2, it has seen widespread usage and testing.
Some say it is THE solution for scalability, others have said it was the biggest disappointment.
Some people gave warnings about the technical issues of MongoDB before 3.2. Also, caveat emptor. The situation is better than a few years back. After all, it has been open source / free software / source-available software since the beginning.
Like I said above, WiredTiger is not the solution to all problems. I cited WiredTiger as a possible tool for building a cluster similar to the current one, where machines have full copies of the data. The advantage of WiredTiger is that it is easier to set up (compared to a distributed database) but it still requires fine-tuning / configuration. Also, there are many other ordered key-value stores in the wild. I have documented those in this document:
https://github.com/scheme-requests-for-implementation/srfi-167/blob/master/l...
In particular, if WDQS doesn't want to use ACID transactions, there might be a better solution. Other popular options are LMDB (used in OpenLDAP) and RocksDB by Facebook (a LevelDB fork). But again, that is ONE possibility; my design / database works with any of the libraries described in the above libraries.md URL.
My recommendation for a production cluster is to use FoundationDB, because it can scale horizontally and provides single / double / triple replication. If a node is down, writes and reads can still continue if you have enough machines up.
WiredTiger would be better suited for a single machine (and my database can support both WiredTiger and FoundationDB with the same code base).
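For what it's worth, the FoundationDB side could look roughly like this with the standard fdb Python bindings. It is a sketch that assumes a reachable cluster and an illustrative key layout, not a finished schema.

import fdb

fdb.api_version(610)
db = fdb.open()                      # assumes the default cluster file is reachable
spo = fdb.Subspace(("spo",))         # one of the index permutations discussed earlier

@fdb.transactional
def add_triple(tr, s, p, o):
    tr[spo.pack((s, p, o))] = b""

@fdb.transactional
def objects(tr, s, p):
    # Range scan over every key that starts with (s, p).
    r = spo.range((s, p))
    return [spo.unpack(kv.key)[2] for kv in tr.get_range(r.start, r.stop)]

add_triple(db, "wikidata", "used-by", "opencog")
add_triple(db, "wikidata", "used-by", "google")
print(objects(db, "wikidata", "used-by"))  # ['google', 'opencog']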
Do FoundationDB and WiredTiger have any track record for hosting open data projects or being chosen by open data projects?
tl;dr: I don't know.
Like I said previously, WiredTiger is used in many contexts; among others, it is used at Amazon Web Services (AWS).
FoundationDB is used at Apple. I don't remember which services rely on it, but at least the Data Science team relies on it. The main contributor did a lightning talk about it:
Entity Store: A FoundationDB Layer for Versioned Entities with Fine Grained https://youtu.be/16uU_Aaxp9Y
That is the use case that looks the most like Wikidata.
More on the popularity contest: it is used at Wavefront (owned by VMware), which is an analytics tool. Here is a talk:
Running FDB at scale https://youtu.be/M438R4SlTFE
JanusGraph has FDB backend, see the talk:
The JanusGraph FoundationDB Storage Adapter https://youtu.be/rQM_ZPZy8Ck
It is also used at Snowflake (https://www.snowflake.com/), which is apparently a data warehouse; here is the talk:
How FoundationDB powers SnowflakeDB's metadata https://youtu.be/KkeyjFMmIf8
It is also used at SkuVault as a multi-model database, see the forum topic:
https://forums.foundationdb.org/t/success-story-foundationdb-at-skuvault/336
Again, I think the popularity of a tool is a hint. For instance, LevelDB is very popular but it is far from the best in terms of speed. Similarly, I would not recommend Oracle Berkeley DB, even if it is owned by Oracle. That said, database configuration and fine-tuning is an art, so probably I did something wrong with Berkeley DB and LevelDB. Maybe. But at least, in my proposal it is possible to benchmark several open-source vendors.
PostgreSQL and MySQL are widely used, e.g. by OpenStreetMap. Virtuoso is used by DBpedia, the LOD Cloud cache and UniProt.
Yes, I know. I am waiting for a proposal to run WDQS on top of MySQL or PostgreSQL.
I don't know FoundationDB or WiredTiger, but in the past there were often these OS projects published by large corporations that worked in-house,
Those are details I cannot know. There are a few hints in the case of WiredTiger, in the sense that there are branches named after MongoDB, e.g. https://github.com/wiredtiger/wiredtiger/tree/mongodb-4.0, so it seems MongoDB uses a specific branch that is public.
For FoundationDB, like I said previously, setting up a cluster is more demanding as it is a distributed database. But it is also more future-proof.
that worked in-house,
I would like to note that IF WDQS is hosted at OpenLink, the problem is the same, if not worse.
but not the OS variant. Apache UIMA was one such example. Maybe Blazegraph works much better if you move to Neptune, that could be a sales hook.
Any open data projects that are running open databases with FoundationDB and WiredTiger? Where can I query them?
Thanks for asking. I will set up a WiredTiger instance of Wikidata. I need a few days, maybe a week (or two :)).
I could set up FoundationDB on a single machine instead, but it will require more time (maybe one more week).
Also, it will not support geo-queries. I will try to make labelling work, but with a custom syntax (inspired by SPARQL).
"Evaluation of Metadata Representations in RDF stores"
I don't understand how this is related to the scaling issues.
Not 100% pertinent, but do you have a better paper?
I have skimmed the paper. On the topic of provenance, I would argue for relying on n+1-tuple items. Otherwise, off-topic, but I can cite Bernstein's MVCC paper, which says that read operations don't block write operations and write operations don't block read operations, which is what WiredTiger and FoundationDB use internally.
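Concretely, the "n+1" idea just means carrying provenance as one extra field per tuple, i.e. storing quads instead of reifying statements. A tiny sketch, with hypothetical source IRIs:

from collections import defaultdict

# Each fact carries its provenance as a fourth element instead of being reified.
quads = [
    ("wikidata", "used-by", "opencog", "graph:opencog-docs"),
    ("wikidata", "used-by", "google", "graph:google-blog"),
]

by_source = defaultdict(list)
for s, p, o, g in quads:
    by_source[g].append((s, p, o))

print(by_source["graph:google-blog"])  # [('wikidata', 'used-by', 'google')]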
[About the proprietary version of Virtuoso], I dare say [it must have an] enormous advantage for us to consider running it in production.
That would be vendor lock-in for Wikidata and Wikimedia, along with all the poor souls that try to interoperate with it.
Actually UniProt and Kingsley suggested hosting the open-source version. It sounded like this will hold for 5 more years, which is probably the average lifecycle. There is also SPARQL, which normally doesn't do vendor lock-in. Maybe you mean that nobody can rent 15 servers and install the same setup as WMF's for Wikidata. That would be true. Switching always seems possible, though.
Reproducibility is a key topic in engineering. That's why there is so much As-A-Service fu around.
Delegating the infrastructure to another party would be a risk and my last choice as far as my values as a human being are concerned. That is what I mean by "portable" Wikidata. This is somewhat off-topic, but I think that making Wikimedia infrastructure reproducible _and_ portable should be some kind of priority for the organisation. That's why I value projects like Kiwix.
Again, thanks for the reply.
Hello Sebastian and Stas,
I figured that anything that is not SPARQL will not be convincing. Getting my engine 100% compatible is a lot of work.
The example deployments I have given in the previous message should be enough to convince you that FoundationDB can store WDQS.
The documented limits of FDB state that it supports up to 100TB of data (https://apple.github.io/foundationdb/known-limitations.html#database-size). That is 100 times more than what WDQS needs at the moment.
Anyway, I updated my proposal to support the Wikimedia Foundation in transitioning to a new solution on the wiki (https://meta.wikimedia.org/wiki/Grants:Project/WDQS_On_FoundationDB) to reflect the new requirements, where the required space was reduced from 12T SSD to 6T SSD; this is based on this FDB forum topic (https://forums.foundationdb.org/t/sizing-and-pricing/379/2?u=amirouche) and an optimisation I will make in my engine. That proposal is biased toward getting an FDB prototype. It could be reworked to emphasize the fact that a benchmarking tool must be put together to be able to tell which solution is best.
My estimations might be off, especially the 1 month of GCP credits.
To be honest, WDQS is low-hanging fruit compared to the goal of building a portable Wikidata.
I am offering my full-time services; it is up to you to decide what will happen.
Hi Amirouche,
On 16.06.19 23:01, Amirouche Boubekki wrote:
I figured that anything that is not SPARQL will not be convincing. Getting my engine 100% compatible is much work.
The example deployment I have given in the previous message should be enough to convince you that FoundationDB can store WDQS.
Don't get me wrong, I don't want you to set it up. I am asking about a reference project that has:
1. open data and an open database
2. decent amount of data
3. several years of running it.
Like OpenStreetMap and PostgreSQL, MediaWiki/Wikipedia -> MySQL, DBpedia -> Virtuoso.
This would be a very good point for it. Otherwise I would consider it a sales trap, i.e. some open source software which does not really work until you switch to the commercial product; same for Neptune.
Now I think only Apple knows how to use it. Any other reference projects?
Hello Sebastian,
On Mon, 17 Jun 2019 at 13:34, Sebastian Hellmann <hellmann@informatik.uni-leipzig.de> wrote:
Don't get me wrong, I don't want you to set it up. I am asking about a reference project that has:
open data and an open database
decent amount of data
several years of running it.
Like OpenStreetMap and PostgreSQL, MediaWiki/Wikipedia -> MySQL, DBpedia -> Virtuoso.
This would be a very good point for it. Otherwise I would consider it a sales trap, i.e. some open source software which does not really work until you switch to the commercial product; same for Neptune.
Now I think only Apple knows how to use it. Any other reference projects?
That is not the response that I would have liked to get. It is a far too common response, one that does not echo state-of-the-art engineering practices in other industries.
Simply said: caveat emptor.
Here is a conversation about distributed database testing https://news.ycombinator.com/item?id=9417996:
[Someone] didn't bother running Jepsen (https://jepsen.io/analyses) against FDB because FoundationDB's internal testing was much more rigorous than Jepsen. The FoundationDB team ran it themselves and it passed with flying colors
Since we are still talking about naysayers and fanboys: one of the founders of FDB is so convinced that their strategy (which boils down to testing via simulation) is what made FDB possible that he started another company to do the same across the industry.
ref: https://www.youtube.com/watch?v=4fFDFbi3toc ref: https://www.youtube.com/watch?v=fFSPwJFXVlw
Like I said in the previous mail:
[The proposal] could be reworked to emphasize the fact that a
benchmarking tool
must be put together to be able to tell which solution is best.
The benchmarking tool must check the performance but also the correctness. I know about https://github.com/webdata/BEAR/ and https://project-hobbit.eu/ but I am not sure whether they do all of that. And I don't know whether they can be adapted to reproduce the WDQS workload.
I just figured that I would not be the right person to do that job, because I am biased toward a solution...
That is a strong requirement for anything going forward.
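Something along the following lines would already go a long way. It is only a sketch: the candidate endpoint URL and the query list are placeholders, and both engines are assumed to speak the standard SPARQL protocol.

import time
import requests

ENDPOINT_A = "https://query.wikidata.org/sparql"   # the current WDQS
ENDPOINT_B = "http://localhost:8000/sparql"        # hypothetical candidate engine

QUERIES = [
    "SELECT (COUNT(*) AS ?c) WHERE { ?s ?p ?o }",  # placeholder; real workload queries go here
]

def run(endpoint: str, query: str):
    start = time.monotonic()
    response = requests.get(
        endpoint,
        params={"query": query},
        headers={"Accept": "application/sparql-results+json"},
        timeout=300,
    )
    response.raise_for_status()
    elapsed = time.monotonic() - start
    bindings = response.json()["results"]["bindings"]
    # Normalise rows so that ordering differences do not count as mismatches.
    rows = sorted(sorted((name, cell["value"]) for name, cell in row.items()) for row in bindings)
    return elapsed, rows

for query in QUERIES:
    time_a, rows_a = run(ENDPOINT_A, query)
    time_b, rows_b = run(ENDPOINT_B, query)
    status = "OK" if rows_a == rows_b else "MISMATCH"
    print(f"{status}  a={time_a:.2f}s  b={time_b:.2f}s  {query[:60]}")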
Have a good day,
Amirouche ~ amz3
Hi!
The documented limits of FDB state that it supports up to 100TB of data (https://apple.github.io/foundationdb/known-limitations.html#database-size). That is 100 times more than what WDQS needs at the moment.
"Support" is such a multi-faceted word. It can mean "it works very well with such amount of data and is faster than the alternatives" or "it is guaranteed not to break up to this number but breaks after it" or "it would work, given massive amounts of memory and super-fast hardware and very specific set of queries, but you'd really have to take an effort to make it work" and everything in between. The devil is always in the details, which this seemingly simple word "supports" is rife with.
I am offering my full-time services; it is up to you to decide what will happen.
I wish you luck with the grant, though I personally think that if you expect to have a production-ready service in 6 months that can replace WDQS, it is a bit too optimistic. I might be completely wrong on this, of course. If you just plan to load the Wikidata data set and evaluate the queries to ensure they are fast and produce proper results on the setup you propose, then it can be done. Good luck!
Hi!
So there needs to be some smarter solution, one that we'd be unlikely to develop in-house
Big cat, small fish. As Wikidata continues to grow, it will have specific needs, needs that are unlikely to be solved by off-the-shelf solutions.
Here I think it's a good place to remind everyone that we're not Google, and developing a new database engine in-house is probably a bit beyond our resources and budgets. Fitting an existing solution to our goals - sure, but developing something new of that scale is probably not going to happen.
FoundationDB and WiredTiger are used respectively at Apple (among other companies) and in MongoDB since 3.2, all over the world. WiredTiger is also used at Amazon.
I believe they are, but I think for our particular goals we have to limit ourselves to a set of solutions that are a proven good match for our case.
We also have a plan on improving the throughput of Blazegraph, which we're working on now.
What is the Phabricator ticket, please?
You can see WDQS task board here: https://phabricator.wikimedia.org/tag/wikidata-query-service/
That would be vendor lock-in for Wikidata and Wikimedia, along with all the poor souls that try to interoperate with it.
Since Virtuoso is using standard SPARQL, it won't be too much of a vendor lock-in, though of course the standard does not cover everything, so some corners are different in all SPARQL engines. This is why even migration between SPARQL engines, even excluding operational aspects, is non-trivial. Of course, migration to any non-SPARQL engine would be an order of magnitude more disruptive, so right now we are not seriously considering doing that.
It has two backends: MMAP and RocksDB.
Sure, but I was talking about the data model - ArangoDB sees the data as a set of documents. The RDF approach is a bit different.
ArangoDB is a multi-model database, it supports:
As I already mentioned, there's a difference between "you can do it" and "you can do it efficiently". Graphs are simple creatures, and can be modeled on many backends - KV, document, relational, column store, whatever you have. The tricky part starts when you need to run millions of queries on a 10B-triple database. If your backend is not optimal for that task, it's not going to perform.
On Wed, 12 Jun 2019 at 19:11, Stas Malyshev smalyshev@wikimedia.org wrote:
Hi!
So there needs to be some smarter solution, one that we'd be unlikely to develop in-house
Big cat, small fish. As Wikidata continues to grow, it will have specific needs, needs that are unlikely to be solved by off-the-shelf solutions.
Here I think it's a good place to remind everyone that we're not Google, and developing a new database engine in-house is probably a bit beyond our resources and budgets.
Today, the problem is not the same as the one MySQL, PostgreSQL, Blazegraph and OpenLink had when they started working on their respective databases. See below.
Fitting an existing solution to our goals - sure, but developing something new of that scale is probably not going to happen.
It will.
FoundationDB and WiredTiger are used respectively at Apple (among other companies) and in MongoDB since 3.2, all over the world. WiredTiger is also used at Amazon.
I believe they are, but I think for our particular goals we have to limit ourselves to a set of solutions that are a proven good match for our case.
See the other mail I just sent. We are at a turning point in database engineering history. The very latest database systems that were built are all based on ordered key-value stores; see the Google Spanner paper [0].
Thanks to WiredTiger/MongoDB and Apple, those are readily available, in widespread use and fully open source. Only a few pieces are missing to make them work in a fully backward-compatible way with WDQS (at scale).
[0] https://ai.google/research/pubs/pub39966
That would be vendor lock-in for Wikidata and Wikimedia, along with all the poor souls that try to interoperate with it.
Since Virtuoso is using standard SPARQL, it won't be too much of a vendor lock-in, though of course the standard does not cover everything, so some corners are different in all SPARQL engines.
There is a big chance that the same thing that happened with the WWW will happen with RDF, that is, one big player owning all the implementations.
This is why even migration between SPARQL engines, even excluding operational aspects, is non-trivial.
I agree.
Of course, migration to any non-SPARQL engine would be an order of magnitude more disruptive, so right now we are not seriously considering doing that.
I also agree.
As I already mentioned, there's a difference between "you can do it" and "you can do it efficiently". [...] The tricky part starts when you need to run millions of queries on 10B triples database. If your backend is not optimal for that task, it's not going to perform.
I already did small benchmarks against Blazegraph. I will do more intensive benchmarks using Wikidata (and reduce the requirements in terms of SSD).
Thanks for the reply.
On 6/12/19 1:11 PM, Stas Malyshev wrote:
That would be vendor lock-in for Wikidata and Wikimedia, along with all the poor souls that try to interoperate with it.
Since Virtuoso is using standard SPARQL, it won't be too much of a vendor lock-in, though of course the standard does not cover everything, so some corners are different in all SPARQL engines. This is why even migration between SPARQL engines, even excluding operational aspects, is non-trivial. Of course, migration to any non-SPARQL engine would be an order of magnitude more disruptive, so right now we are not seriously considering doing that.
Hi Stas,
Yes, Virtuoso supports the W3C SPARQL and ANSI SQL standards. The most important aspect of Virtuoso's design and vision boils down to using open standards on the front- and back-ends to enable maximum flexibility for its users.
There is nothing more important to us than open standards. For instance, we even extend SQL using SPARQL before entering the realm of non-standard extensions.
Hi Guillaume,
On 06.06.19 21:32, Guillaume Lederrey wrote:
Hello all!
There have been a number of concerns raised about the performance and scaling of the Wikidata Query Service. We share those concerns and we are doing our best to address them. Here is some info about what is going on:
In an ideal world, WDQS should:
- scale in terms of data size
- scale in terms of number of edits
- have low update latency
- expose a SPARQL endpoint for queries
- allow anyone to run any queries on the public WDQS endpoint
- provide great query performance
- provide a high level of availability
Scaling graph databases is a "known hard problem", and we are reaching a scale where there are no obvious easy solutions to address all the above constraints. At this point, just "throwing hardware at the problem" is not an option anymore. We need to go deeper into the details and potentially make major changes to the current architecture. Some scaling considerations are discussed in [1]. This is going to take time.
I am not sure how to evaluate this correctly. Scaling databases in general is a "known hard problem", and graph databases are a sub-field of it, optimized for graph-like queries as opposed to column stores or relational databases. If you say that "throwing hardware at the problem" does not help, you are admitting that Blazegraph does not scale for what is needed by Wikidata.
From [1]:
At the moment, each WDQS cluster is a group of independent servers, sharing nothing, with each server independently updated and each server holding a full data set.
Then it is not a "cluster" in the sense of databases. It is more a redundancy architecture like RAID 1. Is this really how BlazeGraph does it? Don't they have a proper cluster solution, where they repartition data across servers? Or is this independent servers a wikimedia staff homebuild?
Some info here:
- We evaluated some stores according to their performance: http://www.semantic-web-journal.net/content/evaluation-metadata-representati... "Evaluation of Metadata Representations in RDF stores"
- Virtuoso has proven quite useful. I don't want to advertise here, but the thing they have going for DBpedia uses ridiculous hardware, i.e. 64GB RAM, and it is also the open-source version, not the professional one with clustering and repartitioning capability. So we have been playing this game for ten years now: everybody tries other databases, but then most people come back to Virtuoso. I have to admit that OpenLink is maintaining the hosting for DBpedia themselves, so they know how to optimise. They normally have large banks as customers, with millions of write transactions per hour. In LOD2 they also implemented column-store features with MonetDB and repartitioning in clusters.
- I recently heard a presentation from ArangoDB and they had a good cluster concept as well, although I don't know anybody who tried it. The slides seemed to make sense.
All the best,
Sebastian
Hello!
On Mon, Jun 10, 2019 at 4:28 PM Sebastian Hellmann hellmann@informatik.uni-leipzig.de wrote:
Hi Guillaume,
I am not sure how to evaluate this correctly. Scaling databases in general is a "known hard problem" and graph databases a sub-field of it, which are optimized for graph-like queries as opposed to column stores or relational databases. If you say that "throwing hardware at the problem" does not help, you are admitting that Blazegraph does not scale for what is needed by Wikidata.
Yes, I am admitting that Blazegraph (at least in the way we are using it at the moment) does not scale to our future needs. Blazegraph does have support for sharding (what they call "Scale Out"). And yes, we need to have a closer look at how that works. I'm not the expert here, so I won't even try to assert if that's a viable solution or not.
From [1]:
At the moment, each WDQS cluster is a group of independent servers, sharing nothing, with each server independently updated and each server holding a full data set.
Then it is not a "cluster" in the sense of databases. It is more a redundancy architecture like RAID 1. Is this really how BlazeGraph does it? Don't they have a proper cluster solution, where they repartition data across servers? Or is this independent servers a wikimedia staff homebuild?
It all depends on your definition of a cluster. We have groups of machines collectively serving some coherent traffic, but each machine is completely independent from the others. So yes, the comparison to RAID 1 is adequate.
Some info here:
- We evaluated some stores according to their performance: http://www.semantic-web-journal.net/content/evaluation-metadata-representati... "Evaluation of Metadata Representations in RDF stores"
Thanks for the link! That looks quite interesting!
- Virtuoso has proven quite useful. I don't want to advertise here, but the thing they have going for DBpedia uses ridiculous hardware, i.e. 64GB RAM and it is also the OS version, not the professional with clustering and repartition capability. So we are playing the game since ten years now: Everybody tries other databases, but then most people come back to virtuoso. I have to admit that OpenLink is maintaining the hosting for DBpedia themselves, so they know how to optimise. They normally do large banks as customers with millions of write transactions per hour. In LOD2 they also implemented column store features with MonetDB and repartitioning in clusters.
I'm not entirely sure how to read the above (and a quick look at the Virtuoso website does not give me the answer either), but it looks like the sharding / partitioning options are only available in the enterprise version. That probably makes it a non-starter for us.
- I recently heard a presentation from Arango-DB and they had a good cluster concept as well, although I don't know anybody who tried it. The slides seemed to make sense.
Nice, another one to add to our list of options to test.
All the best,
Sebastian
Hi Guillaume,
On 10.06.19 16:54, Guillaume Lederrey wrote:
Hello!
Yes, I am admitting that Blazegraph (at least in the way we are using it at the moment) does not scale to our future needs. Blazegraph does have support for sharding (what they call "Scale Out"). And yes, we need to have a closer look at how that works. I'm not the expert here, so I won't even try to assert if that's a viable solution or not.
Yes, sharding is what you need, I think, instead of replication. This is the technique where data is repartitioned into more manageable chunks across servers.
Here is a good explanation of it:
http://vos.openlinksw.com/owiki/wiki/VOS/VOSArticleWebScaleRDF
http://docs.openlinksw.com/virtuoso/ch-clusterprogramming/
Sharding, scale-out or repartitioning is a classical enterprise feature for open-source databases. I am rather surprised that Blazegraph is fully GPL without an enterprise edition. But then they really sounded like their goal as a company was to be bought by a bigger fish, in this case Amazon Web Services. What is their deal? Are they offering support?
So if you go open source, I think you will have a hard time finding good free database sharding/repartitioning. FoundationDB, as proposed in the grant [1], is from Apple.
[1] https://meta.wikimedia.org/wiki/Grants:Project/WDQS_On_FoundationDB
I mean, try the sharding feature. At some point, though, it might be worth considering going enterprise. Corporate open source often has a twist.
Just a note here: Virtuoso is also a full RDBMS, so you could probably keep the Wikibase DB in the same cluster and fix the asynchronicity. That is also true for any mappers like Sparqlify: http://aksw.org/Projects/Sparqlify.html However, these shift the problem; then you need a sharded/repartitioned relational database...
All the best,
Sebastian
On Mon, Jun 10, 2019 at 9:03 PM Sebastian Hellmann hellmann@informatik.uni-leipzig.de wrote:
Hi Guillaume,
Yes, sharding is what you need, I think, instead of replication. This is the technique where data is repartitioned into more manageable chunks across servers.
Well, we need sharding for scalability and replication for availability, so we do need both. The hard problem is sharding.
Here is a good explanation of it:
http://vos.openlinksw.com/owiki/wiki/VOS/VOSArticleWebScaleRDF
Interesting read. I don't see how Virtuoso addresses data locality, it looks like sharding of their RDF store is just hash based (I'm assuming some kind of uniform hash). I'm not enough of an expert on graph databases, but I doubt that a highly connected graph like Wikidata will be able to scale reads without some way to address data locality. Obviously, this needs testing.
http://docs.openlinksw.com/virtuoso/ch-clusterprogramming/
Sharding, scale-out, or repartitioning is classically an enterprise feature of open-source databases. I am rather surprised that Blazegraph is fully GPL without an enterprise edition. But then they really sounded like their goal as a company was to be bought by a bigger fish, in this case Amazon Web Services. What is their deal? Are they offering support?
So if you go open source, I think you will have a hard time finding good free databases with sharding/repartitioning. FoundationDB, as proposed in the grant [1], is from Apple.
[1] https://meta.wikimedia.org/wiki/Grants:Project/WDQS_On_FoundationDB
I mean, try the sharding feature. At some point, though, it might be worth considering going enterprise. Corporate open source often has a twist.
Closed source is not an option. We have strong open source requirements to deploy anything in our production environment.
Just a note here: Virtuoso is also a full RDBMS, so you could probably keep the Wikibase DB in the same cluster and fix the asynchronicity. That is also true for any mappers like Sparqlify: http://aksw.org/Projects/Sparqlify.html However, these just shift the problem: then you need a sharded/repartitioned relational database....
There is no plan to move the Wikibase storage out of MySQL at the moment. In any case, having a low coupling between the primary storage for wikidata and a secondary storage for complex querying is a sound architectural principle. This asynchronous update process is most probably going to stay in place, just because it makes a lot of sense.
Thanks for the discussion so far! It is always interesting to have outside ideas!
Have fun!
Guillaume
All the best,
Sebastian
From [1]:
At the moment, each WDQS cluster is a group of independent servers, sharing nothing, with each server independently updated and each server holding a full data set.
Then it is not a "cluster" in the sense of databases. It is more a redundancy architecture like RAID 1. Is this really how Blazegraph does it? Don't they have a proper cluster solution, where they repartition data across servers? Or is this independent-servers setup a Wikimedia staff homebuild?
It all depends on your definition of a cluster. We have groups of machines collectively serving some coherent traffic, but each machine is completely independent from the others. So yes, the comparison to RAID 1 is adequate.
Some info here:
- We evaluated some stores according to their performance: http://www.semantic-web-journal.net/content/evaluation-metadata-representati... "Evaluation of Metadata Representations in RDF stores"
Thanks for the link! That looks quite interesting!
- Virtuoso has proven quite useful. I don't want to advertise here, but the setup they have going for DBpedia uses ridiculous hardware, i.e. 64GB RAM, and it is also the open-source version, not the professional one with clustering and repartitioning capability. We have been playing this game for ten years now: everybody tries other databases, but then most people come back to Virtuoso. I have to admit that OpenLink is maintaining the hosting for DBpedia themselves, so they know how to optimise. They normally have large banks as customers, with millions of write transactions per hour. In LOD2 they also implemented column-store features with MonetDB and repartitioning in clusters.
I'm not entirely sure how to read the above (and a quick look at the Virtuoso website does not give me the answer either), but it looks like the sharding / partitioning options are only available in the enterprise version. That probably makes it a non-starter for us.
- I recently heard a presentation from ArangoDB, and they had a good cluster concept as well, although I don't know anybody who has tried it. The slides seemed to make sense.
Nice, another one to add to our list of options to test.
All the best,
Sebastian
Reasonably, addressing all of the above constraints is unlikely to ever happen. Some of the constraints are non negotiable: if we can't keep up with Wikidata in term of data size or number of edits, it does not make sense to address query performance. On some constraints, we will probably need to compromise.
For example, the update process is asynchronous. It is by nature expected to lag. In the best case, this lag is measured in minutes, but can climb to hours occasionally. This is a case of prioritizing stability and correctness (ingesting all edits) over update latency. And while we can work to reduce the maximum latency, this will still be an asynchronous process and needs to be considered as such.
We currently have one Blazegraph expert working with us to address a number of performance and stability issues. We are planning to hire an additional engineer to help us support the service in the long term. You can follow our current work in phabricator [2].
If anyone has experience with scaling large graph databases, please reach out to us, we're always happy to share ideas!
Thanks all for your patience!
Guillaume
[1] https://wikitech.wikimedia.org/wiki/Wikidata_query_service/ScalingStrategy [2] https://phabricator.wikimedia.org/project/view/1239/
-- All the best, Sebastian Hellmann
Director of Knowledge Integration and Linked Data Technologies (KILT) Competence Center at the Institute for Applied Informatics (InfAI) at Leipzig University Executive Director of the DBpedia Association Projects: http://dbpedia.org, http://nlp2rdf.org, http://linguistics.okfn.org, https://www.w3.org/community/ld4lt Homepage: http://aksw.org/SebastianHellmann Research Group: http://aksw.org
Hi Guillaume, All,
As the lead developer of sparql.uniprot.org, one of the few SPARQL endpoints with much more data (7x) than Wikidata and significant external users, I can chime in with our experience of hosting data with Virtuoso. All in all, I am very happy with it, and it has made our endpoint possible and useful on a shoestring budget.
We, like Wikidata, have an async loading process and allow anyone to run analytics queries on our SPARQL endpoint with generous timeouts.
We have two servers each with 256GB of ram and 8TB of raw SSD space (consumer). These are whitebox AMD machines from 2014 and the main cost at the time was the RAM. The setup was relatively cheap (cheaper than what is documented at https://www.mediawiki.org/wiki/Wikidata_Query_Service/Implementation#Hardwar...)
Even in 2014 we already had more data than you do now.
There is a third multi-use server which does the loading of data offline. This is now a larger, newer Epyc server with more RAM and more SSD, but it is used for much more than just the RDF loading.
Unlike most sites, we do have our own custom frontend in front of Virtuoso. We did this to allow more styling, as well as to stay flexible and change implementations at our whim; e.g., we double-parse the SPARQL queries and even rewrite some to be friendlier. I suggest you do the same no matter which DB you use in the end, and we would be willing to open source ours (it is in Java, and uses RDF4J and some ugly JSPX, but it works; if not to use, then at least as an inspiration). We did this to avoid being locked into endpoint-specific features.
We use the open-source edition of Virtuoso and do not need the sharding etc. features. We use the CAIS (Cheap Array of Independent Servers ;) approach to resilience. OpenLink Software, the company behind Virtuoso, can deliver support for the open-source edition, and if you are interested I suggest you talk to them.
Virtuoso 7 has become very resilient over the years and does not need much hand-holding anymore (in 2015 this was different). Of course we have aggressive auto-restart code, but it is rarely triggered these days, even while the inbound queries are getting more complex.
Some of the tricks you have built into WDQS are going to be a pain to redo in Virtuoso. But I don't see anything impossible there.
Pragmatically, while WDQS is a graph database, the queries are actually very relational, and none of the standard graph algorithms are used. To be honest, RDF is actually a relational system, which means that relational techniques are very good at answering such queries. The sole issue is recursive queries (e.g. rdfs:subClassOf+), for which the Virtuoso implementation is adequate but not great.
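To make the recursive case concrete, here is a small illustrative sketch in WDQS style (my own example, not a query taken from either service; the prefixes are the standard Wikidata ones, and the label SERVICE is one of those WDQS-specific tricks mentioned above). The only non-relational part is the property path:

  PREFIX wd: <http://www.wikidata.org/entity/>
  PREFIX wdt: <http://www.wikidata.org/prop/direct/>
  PREFIX wikibase: <http://wikiba.se/ontology#>
  PREFIX bd: <http://www.bigdata.com/rdf#>
  # Items that are instances (P31) of "house cat" (Q146) or of any transitive subclass (P279) of it
  SELECT ?item ?itemLabel WHERE {
    ?item wdt:P31/wdt:P279* wd:Q146 .
    SERVICE wikibase:label { bd:serviceParam wikibase:language "en" . }
  }
  LIMIT 100

Everything else in a query of this shape is ordinary join work that a relational engine handles well; it is only the P279* traversal that exercises the recursive machinery.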
This is why recovering physical schemata from RDF data is such a powerful optimization technique [1]; i.e., you tend to do joins, not traversals. This is not always true, but I strongly suspect it will hold for the vast majority of the Wikidata Query Service case.
I hope this was helpful, and I am willing to answer further questions.
Regards, Jerven
[1] https://research.vu.nl/files/61555276/complete%20dissertation.pdf and associated work that was done by Orri Erling, which unfortunately has not yet landed in the Virtuoso master branch.
On Tue, Jun 11, 2019 at 11:23 AM Jerven Bolleman et al wrote:
So we are playing the game since ten years now: Everybody tries other
databases, but then most people come back to virtuoso.
Nothing bad about Virtuoso, on the contrary, they are a prime infrastructure provider (except maybe their trademark SPARQL query: "select distinct ?Concept where {[] a ?Concept}" ;). But I personally think that replacing the current WDQS with Virtuoso would be a bad idea. Not from a performance perspective, but more because of the signal it gives. If indeed, as you state, Virtuoso is the only viable solution in the field, this field is nothing more than a niche. We really need more competition to get things done. Since both DBpedia and UniProt are indeed already running on Virtuoso - where it is doing a prime job - having Wikidata running on another vendor's infrastructure does provide us with the much-needed benchmark. The benchmark seems to be telling some of us already that there is room for other alternatives. So it is fulfilling its benchmark role. Is there really no room for improvement with Blazegraph? How about GraphDB?
And of course, not to forget the fully open-source, SPARQL 1.1 compliant RDF database Apache Jena with TDB. Did you already evaluate Apache Jena for use with Wikidata?
Hi Andra,
The goal is to provide a solution to a problem. Unfortunately, it has turned into a product debate. I struggle with the logic of a demonstrable solution being challenged by a lack of alternatives.
The fundamental goal of Linked Data is to enable Data Publication and Access that applies capabilities delivered by HTTP to modern Data Access, Integration, and Management.
The Linked Data meme outlining Linked Data principles has existed since 2006. Like others, we digested the meme and applied it to our existing SQL RDBMS en route to producing a solution that made the vision in the paper reality, as demonstrated by DBpedia, DBpedia-Live, Uniprot, our LOD Cloud Cache, and many other nodes in the massive LOD Cloud [1].
Virtuoso's role in the LOD Cloud is an example of what happens when open standards are understood and appropriately applied to a problem, with a little innovation.
Links:
[1] https://medium.com/virtuoso-blog/what-is-the-linked-open-data-cloud-and-why-... -- What is the LOD Cloud, and why is it important?
[2] https://medium.com/virtuoso-blog/what-is-small-data-and-why-is-it-important-... -- What is Small Data, and why is it important?
Hi!
Unlike, most sites we do have our own custom frontend in front of virtuoso. We did this to allow more styling, as well as being flexible and change implementations at our whim. e.g. we double parse the SPARQL queries and even rewrite some to be friendlier. I suggest you do the same no matter which DB you use in the end, and we would be willing to open source ours (it is in Java, and uses RDF4J and some ugly JSPX but it works, if not to use at least as an inspiration). We did this to avoid being locked into endpoint specific features.
It would be interesting to know more about this, if this is open source. Is there any more information about it online?
Pragmatically, while WDS is a Graph database, the queries are actually very relational. And none of the standard graph algorithms are used. To
If you mean algorithms like A* or PageRank, then yes, they are not used much (likely because SPARQL has no standard support for any of these), though Blazegraph implements some of them as custom services.
be honest RDF is actually a relational system which means that relational techniques are very good at answering them. The sole issue is recursive queries (e.g. rdfs:subClassOf+) in which the virtuoso implementation is adequate but not great.
Yes, path queries are pretty popular on WDQS too, especially since many relationships, like administrative/territorial placement or ownership, are hierarchical and transitive, which often calls for path queries.
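For example (an illustrative sketch, not a query pulled from our logs), transitive administrative containment is typically expressed with a one-or-more path over P131, "located in the administrative territorial entity":

  PREFIX wd: <http://www.wikidata.org/entity/>
  PREFIX wdt: <http://www.wikidata.org/prop/direct/>
  # Everything located, directly or transitively, in Berlin (Q64)
  SELECT ?place WHERE {
    ?place wdt:P131+ wd:Q64 .
  }
  LIMIT 100

Queries of this shape are cheap to write but force the engine into an unbounded traversal, which is exactly where implementations tend to differ the most.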
This is why recovering physical schemata from RDF data is such a powerful optimization technique [1]. i.e. you tend to do joins not traversals. This is not always true but I strongly suspect it will hold for the vast majority of the Wikidata Query Service case.
Would be interesting to see if we can apply anything from the article. Thanks for the link!
On 6/10/19 3:49 PM, Guillaume Lederrey wrote:
Interesting read. I don't see how Virtuoso addresses data locality, it looks like sharding of their RDF store is just hash based (I'm assuming some kind of uniform hash).
It handles data locality across a shared-nothing cluster just fine, i.e., you can interact with any node in a Virtuoso cluster and experience identical behavior (every node looks like a single node in the eyes of the operator).
I'm not enough of an expert on graph databases, but I doubt that a highly connected graph like Wikidata will be able to scale reads without some way to address data locality. Obviously, this needs testing.
There are live instances of Virtuoso that demonstrate its capabilities. If you want to explore shared-nothing cluster capabilities, then our live LOD Cloud cache is the place to start [1][2][3]. If you want to see the single-server open source edition, you have DBpedia, DBpedia-Live, Uniprot, and many other nodes in the LOD Cloud to choose from. All of these instances are highly connected.
If you want to get into the depths of Linked Data regarding query processing pipelines that include URI (or Super Key) de-reference, you can take a look at our URIBurner Service [4][5].
Virtuoso handles both shared-nothing clusters and replication i.e., you can have a cluster configuration used in conjunction with a replication topology if your solution requires that.
Virtuoso is a full-blown SQL RDBMS that leverages SPARQL and a SQL extension for handling challenges associated with Entity Relationship Graphs represented as RDF statement collections. You can even use SPARQL inside SQL from any ODBC- or JDBC-compliant app or service etc..
Links:
[2] https://twitter.com/search?f=tweets&vertical=default&q=%23PermID%20%... -- query samplings via links included in tweets
[3] https://tinyurl.com/y47prg9h -- SPARQL transitive option applied to a skos taxonomy tree
[4] https://linkeddata.uriburner.com -- this service provides Linked Data transformation combined with an ability to de-ref URI-variables and URI-constants in the body of a query as part of the solution production pipeline; it also includes a service that adds image processing to the aforementioned pipeline via the PivotViewer module for data visualization
[5] https://medium.com/virtuoso-blog/what-is-small-data-and-why-is-it-important-... -- About Small Data (use of URI-dereference to tackle thorny data access challenges by leveraging the power of HTTP URIs as Super Keys)
Hi!
It handles data locality across a shared nothing cluster just fine i.e., you can interact with any node in a Virtuoso cluster and experience identical behavior (everyone node looks like single node in the eyes of the operator).
Does this mean no sharding, i.e. each server stores the full DB? This is the model we're using currently, but given the growth of the data it may not be sustainable on current hardware. I see in your tables that Uniprot has about 30B triples, but I wonder what the update loads there look like. Our main issue is that the hardware we have now is showing its limits when there are a lot of updates in parallel with significant query load. So I wonder if the "single server holds everything" model is sustainable in the long term.
There are live instances of Virtuoso that demonstrate its capabilities. If you want to explore shared-nothing cluster capabilities then our live LOD Cloud cache is the place to start [1][2][3]. If you want to see the single-server open source edition that you have DBpedia, DBpedia-Live, Uniprot and many other nodes in the LOD Cloud to choose from. All of these instance are highly connected.
Again, here the question is not so much "can you load 7bn triples into Virtuoso" - we know we can. What we want to figure out is whether, given the specific query/update patterns we have now, it is going to give us significantly better performance, allowing us to support our projected growth. And also possibly whether Virtuoso has ways to make our update workflow more efficient - e.g. right now, if one triple changes in a Wikidata item, we're essentially downloading and updating the whole item (not exactly, since triples that stay the same are preserved, but it requires a lot of data transfer to express that in SPARQL). Would there be ways to update things more efficiently?
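Very roughly, and purely as an illustration (the real updater is smarter and preserves unchanged triples, as noted above; the item and triples below are placeholders), the effect today is closer to a full rewrite of the item than to a diff:

  PREFIX wd: <http://www.wikidata.org/entity/>
  PREFIX wdt: <http://www.wikidata.org/prop/direct/>
  # 1) Schematically: drop what the store currently says about the item...
  DELETE WHERE { wd:Q42 ?p ?o } ;
  # 2) ...then re-send the item's full set of triples, even though only one of them changed
  INSERT DATA {
    wd:Q42 wdt:P31 wd:Q5 .
    # ... plus hundreds of unchanged triples for the same item ...
  }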
Virtuoso handles both shared-nothing clusters and replication i.e., you can have a cluster configuration used in conjunction with a replication topology if your solution requires that.
Replication could certainly be useful, I think, if it's faster to update a single server and then replicate than to simultaneously update all servers (that's what is happening now).
Hello, Stas --
On Jun 13, 2019, at 07:52 PM, Stas Malyshev smalyshev@wikimedia.org wrote:
Hi!
It handles data locality across a shared nothing cluster just fine i.e., you can interact with any node in a Virtuoso cluster and experience identical behavior (everyone node looks like single node in the eyes of the operator).
Does this mean no sharding, i.e. each server stores the full DB?
No.
The full DB is automatically sharded across all Virtuoso instances in an Elastic Cluster, and each instance *appears* to store the full DB -- i.e., you can issue a query to any instance in an Elastic Cluster, if you have the relevant communication details (typically IP address and port number), and you will get the same results from it as from any other instance in that Elastic Cluster.
(I am generally specific about Elastic Cluster vs Replication Cluster, because these are different though complementary technologies, implemented via different Modules in Virtuoso.)
This is the model we're using currently, but given the growth of the data it may be non sustainable on current hardware. I see in your tables that Uniprot has about 30B triples, but I wonder how update loads there look like. Our main issue is that the hardware we have now is showing its limits when there's a lot of updates in parallel to significant query load. So I wonder if the "single server holds everything" model is sustainable in the long term.
Your questions are unsurprising, and are one of the reasons for the benchmark efforts of the LDBC --
http://ldbcouncil.org/benchmarks/
Uniprot does not get a lot of updates, and it is running on a single instance -- i.e., there's no cluster involved at all, neither Elastic (Shared-Nothing) Cluster nor Replication Cluster -- so it's probably not the best example for your workflows.
I think the LDBC's Social Networking Benchmark (SNB) is likely to be the closest to the Wikidata update and query patterns, so you may find these articles interesting --
1. SNB Interactive, Part 1: What is SNB Interactive Really About? https://virtuoso.openlinksw.com/blog/vdb/blog/?id=1835
2. SNB Interactive, Part 2: Modeling Choices https://virtuoso.openlinksw.com/blog/vdb/blog/?id=1837
3. SNB Interactive, Part 3: Choke Points and Initial Run on Virtuoso https://virtuoso.openlinksw.com/blog/vdb/blog/?id=1842
There are live instances of Virtuoso that demonstrate its capabilities. If you want to explore shared-nothing cluster capabilities then our live LOD Cloud cache is the place to start [1][2][3]. If you want to see the single-server open source edition that you have DBpedia, DBpedia-Live, Uniprot and many other nodes in the LOD Cloud to choose from. All of these instance are highly connected.
Again, here the question is not too much in "can you load 7bn triples into Virtuoso" - we know we can. What we want to figure out whether given specific query/update patterns we have now - it is going to give us significantly better performance allowing to support our projected growth. And also possibly whether Virtuoso has ways to make our update workflow be more optimal - e.g. right now if one triple changes in Wikidata item, we're essentially downloading and updating the whole item (not exactly since triples that stay the same are preserved but it requires a lot of data transfer to express that in SPARQL). Would there be ways to update the things more efficiently?
The first thing that will improve your performance is to break out of the "stored as JSON blobs" pattern you've been using.
Updates should not require a full download of the named graph (which I think is what your JSON Blobs amount to) followed by an upload of the entire revised named graph.
Even if you *query* the full content of an existing named graph, determine the necessary changes locally, and then submit an update query which includes a full set of DELETE + INSERT statements (this "full set" only including the *changed* triples), you should find a significant reduction in data throughput.
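A minimal sketch of what such a diff-style update can look like (the item, property, and values here are invented purely for illustration; the real Wikidata RDF model also involves statement, qualifier, and reference nodes):

  PREFIX wd: <http://www.wikidata.org/entity/>
  PREFIX wdt: <http://www.wikidata.org/prop/direct/>
  # Only the changed triple travels over the wire: remove the old value...
  DELETE DATA { wd:Q64 wdt:P1082 "3644826" . } ;
  # ...and insert the new one. Everything else stays untouched on the server.
  INSERT DATA { wd:Q64 wdt:P1082 "3677472" . }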
The live parallel to such regular updates is DBpedia-Live, which started from a static load of dump files, and has been (and is still) continuously updated by an RDF feed based on the Wikipedia update firehose. The same RDF feed is made available to users of our AMI-based DBpedia-Live mirror AMI (currently being refreshed, and soon to be made available for new users) --
https://aws.amazon.com/marketplace/pp/B012DSCFEK
Virtuoso handles both shared-nothing clusters and replication i.e., you can have a cluster configuration used in conjunction with a replication topology if your solution requires that.
Replication could certainly be useful, I think, if it's faster to update a single server and then replicate than to simultaneously update all servers (that's what is happening now).
There are multiple Replication strategies which might be used, as well as multiple Replication Cluster topologies which might be considered, and none of them is inherently the fastest.
That said, periodic monolithic replication of an entire dataset or DB would certainly not be faster than propagation of DIFFs from the master to the replica(s). Replication via periodic cumulative DIFFs *may* be faster than incremental DIFFs that are dispatched after every change, but this depends on many variables.
This page of cluster topology diagrams starts with Replication-only and progresses to Elastic-only. (There are no illustrations of a combined Replicating-Elastic-Cluster on this page.)
http://vos.openlinksw.com/owiki/wiki/VOS/VirtClusteringDiagrams
Any Replication Cluster topology and methodology -- including zero Replication -- may be combined with an Elastic (Shared-Nothing) Cluster setup. Generally speaking, when these are combined, an entire Elastic Cluster would take the place of each Single-Server Instance in a given Replication topology.
I hope this helps your understanding of the available options.
Ted
-- A: Yes. http://www.idallen.com/topposting.html | Q: Are you sure? | | A: Because it reverses the logical flow of conversation. | | | Q: Why is top posting frowned upon?
Ted Thibodeau, Jr. // voice +1-781-273-0900 x32 Senior Support & Evangelism // mailto:tthibodeau@openlinksw.com // http://twitter.com/TallTed OpenLink Software, Inc. // http://www.openlinksw.com/ 20 Burlington Mall Road, Suite 322, Burlington MA 01803 Weblog -- http://www.openlinksw.com/blogs/ Community -- https://community.openlinksw.com/ LinkedIn -- http://www.linkedin.com/company/openlink-software/ Twitter -- http://twitter.com/OpenLink Facebook -- http://www.facebook.com/OpenLinkSoftware Universal Data Access, Integration, and Management Technology Providers
Changing the subject a bit:
I am surprised to see how many SPARQL requests go to the endpoint when performing a ShEx validation with the shex-simple Toolforge tool. They are all very simple and complete quickly. For each Wikidata item tested, one of our tests [1] issues tens of requests. That is, testing 100 Wikidata items may yield thousands of requests to the endpoint in rapid succession.
I suppose that given the simple SPARQL queries, these kinds of requests might not load WDQS very much.
[1] https://tools.wmflabs.org/shex-simple/wikidata/packages/shex-webapp/doc/shex...
On Mon, Jun 17, 2019 at 09:41:51PM +0200, Finn Aarup Nielsen wrote:
Changing the subject a bit:
I am surprised to see how many SPARQL requests go to the endpoint when performing a ShEx validation with the shex-simple Toolforge tool. They are all very simple and quickly complete. For each Wikidata item tested, one of our tests [1] requests tens of times. That is, testing 100 Wikidata items may yield thousands of requests to the endpoint in rapid succession.
I suppose that given the simple SPARQL queries, these kinds of requests might not load WDQS very much.
It's true; they require no joins and are designed to be answerable by only looking at the index. That said, given that they impose virtually no load, running them with API access to the Blazegraph getStatements() call [2] would make validation thousands of times faster and eliminate parsing and query planning time on the SPARQL server.
[1] https://tools.wmflabs.org/shex-simple/wikidata/packages/shex-webapp/doc/shex...
[2] https://www.programcreek.com/java-api-examples/?class=org.eclipse.rdf4j.repo...
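Roughly, the queries in question look something like this (illustrative subject and predicate only): a single triple pattern with a bound subject and predicate, which the store can answer straight from its index, and which maps one-to-one onto a getStatements(subject, predicate, null) lookup.

  PREFIX wd: <http://www.wikidata.org/entity/>
  PREFIX wdt: <http://www.wikidata.org/prop/direct/>
  # Fetch the values of one property of one item; no joins involved
  SELECT ?class WHERE {
    wd:Q42 wdt:P31 ?class .
  }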
Dear,
I've seen this suggestion on Quora: https://www.quora.com/Wouldnt-a-mix-database-system-that-handle-both-JSON-do...
I'm not qualified enough to know if it is relevant, but it could be food for brainstorming.
Regards
Thibaut, while it's certainly exciting to see continued work on the development of storage solutions, and hybrids are most likely part of the future story here, I'd also want to stay as close as possible to existing Semantic Web / Linked Data standards like RDF and SPARQL to guarantee interop and extensibility, no matter what mix of underlying tech is being deployed under the hood.
In the enterprise where I work as a Data Architect, we approach scaling in many ways, but there's no question that the age-old technique of SORTING lines up everything for systems and CPUs to massively ingest and pipeline across IO boundaries. Sometimes this involves more indices, lots of duplicated data, and scatter/gather techniques. Knowing WHAT to sort and HOW to sort will vary widely by the queries that are expected to perform well in a system. So sorted indices of many kinds (where data is duplicated) are necessary to achieve extremely fast IO for a broad database such as Wikidata.
Scaling problems can be categorized in a few buckets:
1. Increase of data queries (READ).
2. Increase of data writes (WRITE).
3. There is no 3, because all scale problems boil down to IO (READ/WRITE) and how you approach fast IO.
Google is known to replicate data at different levels of abstraction (metaschema, indices, meta-relations) across entire regions of the world in order to achieve fast IO. With a nearly unlimited budget they can MOVE THINGS FAST certainly and afford to be extremely wasteful and smart with data replication techniques.
IBM approaches the scale problem via Polymorphic stores that support multiple indices, db structures, both in-memory and graph-like. Essentially, duplicating the hell out of the data in many, many ways and wasting space and memory to result in extremely high performance on queries. https://queue.acm.org/detail.cfm?id=3332266
Juan Sequeda (now at data.world) and the team at Capsenta also seem to use polymorphic storage to bridge SPARQL and relational DBs. But I'm unsure of the actual architecture and would love to hear more about it. I've followed Juan for some time. https://www.zdnet.com/article/data-world-joins-forces-with-capsenta-to-bring...
It is unfortunate that Wikidata doesn't have the hardware resources to duplicate and sort data in myriad ways to achieve better scale. On the software(s) side, we all know what the capabilities are of various stacks, but we often don't have the "time" or "hardware" to truly flex the "software" stack muscles to allow fast IO.
On Jun 17, 2019, at 03:41 PM, Finn Aarup Nielsen faan@dtu.dk wrote:
Changing the subject a bit:
Well... Changing the subject a *lot*, to an extent probably worthy of its own subject line and an entirely new thread, since it seems more relevant to the "shex-simple Toolforge tool" you reference than to anything in this thread about scaling the back-end.
Ted
Hi!
Yes, sharding is what you need, I think, instead of replication. This is the technique where data is repartitioned into more manageable chunks across servers.
Agreed, if we are to get any solution that is not constrained by hardware limits of a single server, we can not avoid looking at sharding.
Here is a good explanation of it:
http://vos.openlinksw.com/owiki/wiki/VOS/VOSArticleWebScaleRDF
Thanks, very interesting article. I'd certainly like to know how this works with a database on the order of 10 bln. triples and queries both accessing and updating random subsets of them. Updates are not covered very thoroughly there - this is, I suspect, because many databases of 10 bln. size do not have as active a (non-append) update workload as we do. Maybe they still manage to solve it; if so, I'd very much like to know about it.
Just a note here: Virtuoso is also a full RDMS, so you could probably keep wikibase db in the same cluster and fix the asynchronicity. That is
Given how the original data is stored (a JSON blob inside a MySQL table), it would not be very useful. In general, the graph data model and the wikitext data model on top of which Wikidata is built are very, very different, and expecting the same storage to serve both - at least without very major and deep refactoring of the code on both sides - is not currently very realistic. And of course moving any of the wiki production databases to Virtuoso would be a non-starter. Given that the original Wikidata database stays on MySQL - which I think is a reasonable assumption - there would need to be a data migration pipeline for data to come from MySQL to whatever is the WDQS NG storage.
also true for any mappers like Sparqlify: http://aksw.org/Projects/Sparqlify.html However, these shift the problem, then you need a sharded/repartitioned relational database....
Yes, relational-RDF bridges are known, but my experience is that they usually are not very performant (the difference between "you can do it" and "you can do it fast" is sometimes very significant), and in our case it would be useless anyway, as Wikidata data is not really stored in a relational database per se - it's stored in a JSON blob opaquely saved in a relational database structure that knows nothing about Wikidata. Yes, it's not the ideal structure for optimal performance of Wikidata itself, but I do not foresee this changing, at least in the short term. Again, we could of course have a data export pipeline to whatever storage format we want - essentially we already have one - but the concept of having a single data store is probably not realistic, at least within foreseeable timeframes. We use a separate data store for search (ElasticSearch) and will probably have to have a separate one for queries, whatever the mechanism.
Thanks,
On 6/10/19 4:25 PM, Stas Malyshev wrote:
Just a note here: Virtuoso is also a full RDMS, so you could probably keep wikibase db in the same cluster and fix the asynchronicity. That is
Given how the original data is stored (JSON blob inside mysql table) it would not be very useful. In general, graph data model and Wikitext data model on top of which Wikidata is built are very, very different, and expecting same storage to serve both - at least without very major and deep refactoring of the code on both sides - is not currently very realistic. And of course moving any of the wiki production databases to Virtuoso would be a non-starter. Given than original Wikidata database stays on Mysql - which I think is a reasonable assumption - there would need to be a data migration pipeline for data to come from Mysql to whatever is the WDQS NG storage.
Hi Stas,
Data living in an RDBMS engine distinct from Virtuoso is handled via the engine's Virtual Database module, i.e., you can build powerful RDF Views over ODBC- or JDBC-accessible data using Virtuoso. These views also have the option of being materialized, etc.
[1] https://medium.com/virtuoso-blog/conceptual-data-virtualization-for-sql-and-... -- Conceptual Data Virtualization using Virtuoso
[2] https://medium.com/virtuoso-blog/generate-relational-tables-to-rdf-relationa... -- RDF Views generation over SQL RDBMS data sources using the Virtuoso Wizard
Hi!
Data living in an RDBMS engine distinct from Virtuoso is handled via the engines Virtual Database module i.e., you can build powerful RDF Views over ODBC- or JDBC- accessible data using Virtuoso. These view also have the option of being materialized etc..
Yes, but the way the data are stored now is a JSON blob within a text field in MySQL. I do not see how an RDF View over ODBC would help here - of course Virtuoso would be able to fetch the JSON text for a single item, but then what? We'd need to run queries across millions of items, and fetching and parsing JSON for every one of them every time is unfeasible. Not to mention this JSON is not an accurate representation of the RDF data model. So I don't think it is worth spending time in this direction... I just don't see how any query engine could work with that storage.
On 6/13/19 7:55 PM, Stas Malyshev wrote:
Hi!
Data living in an RDBMS engine distinct from Virtuoso is handled via the engines Virtual Database module i.e., you can build powerful RDF Views over ODBC- or JDBC- accessible data using Virtuoso. These view also have the option of being materialized etc..
Yes, but the way the data are stored now is JSON blob within a text field in MySQL. I do not see how RDF View over ODBC would help it any - of course Virtuoso would be able to fetch JSON text for a single item, but then what? We'd need to run queries across millions of items, fetching and parsing JSON for every one of them every time is unfeasible. Not to mention this JSON is not an accurate representation of the RDF data model. So I don't think it is worth spending time in this direction... I just don't see how any query engine could work with that storage. -- Stas Malyshev smalyshev@wikimedia.org
The point I am trying to make is that Virtuoso can integrate data from external DBMS systems in a variety of ways.
ODBC and JDBC are simply APIs for accessing external DBMS systems.
What you really need here is a clear project definition and a discussion with us about how it would be implemented.
Despite the fact that Virtuoso is a hardcore DBMS, its also a hardcore Data Virtualization platform for handling relations represented in a variety of ways using a plethora of protocols.
I am an email away if you want to explore this further.
On 6/10/19 10:54 AM, Guillaume Lederrey wrote:
- Virtuoso has proven quite useful. I don't want to advertise here, but the thing they have going for DBpedia uses ridiculous hardware, i.e. 64GB RAM and it is also the OS version, not the professional with clustering and repartition capability. So we are playing the game since ten years now: Everybody tries other databases, but then most people come back to virtuoso. I have to admit that OpenLink is maintaining the hosting for DBpedia themselves, so they know how to optimise. They normally do large banks as customers with millions of write transactions per hour. In LOD2 they also implemented column store features with MonetDB and repartitioning in clusters.
I'm not entirely sure how to read the above (and a quick look at virtuoso website does not give me the answer either), but it looks like the sharding / partitioning options are only available in the enterprise version. That probably makes it a non starter for us.
Virtuoso Cluster Edition is as described by Sebastian in an earlier post to this thread [1]. Online, that's what is behind our LOD Cloud cache, which hosts 40 billion+ triples, but still uses ridiculously cheap hardware for the shared-nothing cluster.
As Jerven has already articulated [2], the single-server open source edition of Virtuoso can also scale to 40 Billion+ triples as demonstrated by Uniprot amongst others.
There's a publicly available Google Spreadsheet that provides insights into a variety of Virtuoso configurations that you can also look at regarding resource requirements [3].
Bottom line, Virtuoso has no fundamental issues with performance, scale, or security (most haven't hit this bump yet, but it's coming!) regarding RDF data deployed in line with Linked Data principles.
We are always opened to collaboration with anyone (or group) seeking to fully exploit the power and promise of a Semantic Web derived from Linked Data :)
Links:
[1] https://lists.wikimedia.org/pipermail/wikidata/2019-June/013132.html -- Sebastian Hellman comment
[2] https://lists.wikimedia.org/pipermail/wikidata/2019-June/013143.html -- Jerven Bolleman comment
[3] https://docs.google.com/spreadsheets/d/1-stlTC_WJmMU3xA_NxA1tSLHw6_sbpjff-5OITtrbFw/edit?ouid=112399767740508618350&usp=sheets_home&ths=true -- Virtuoso configurations sample spreadsheet
[4] https://hub.docker.com/u/openlink/ -- Docker Hub offerings
[5] https://aws.amazon.com/marketplace/pp/B00ZWMSNOG -- Amazon Marketplace BYOL Edition
[6] https://aws.amazon.com/marketplace/pp/B011VMCZ8K -- Amazon Marketplace PAGO Edition
[7] https://github.com/openlink/virtuoso-opensource -- Github
[8] http://download.openlinksw.com -- Download Site
Hi!
I am not sure how to evaluate this correctly. Scaling databases in general is a "known hard problem", and graph databases are a sub-field of it, optimized for graph-like queries as opposed to column stores or relational databases. If you say that "throwing hardware at the problem" does not help, you are admitting that Blazegraph does not scale to what Wikidata needs.
I think this is over-generalizing. We have a database that grew 10x over the last 4 years. We have certain hardware and software limits, both with the existing hardware and, in principle, with any hardware we could buy. We also have certain issues specific to graph databases that make scaling harder: for example, document databases like ElasticSearch, and certain models of relational databases, shard easily. Sharding something like the Wikidata graph is much harder, especially if the underlying database knows nothing about the specifics of Wikidata's data (which would be the case for all off-the-shelf databases). If we just randomly split the triples between several servers, we'd probably just be modeling a large but extremely slow disk. So there needs to be some smarter solution, one that we'd rather not develop in-house but one that has already been validated by industry experience and other deployments.
Is the issue specific to Blazegraph, and can it be solved by switching platforms? Maybe; we do not know yet. We have not yet identified a better solution that guarantees better scalability, but we have a plan for finding one, given the resources. We also have a plan for improving the throughput of Blazegraph, which we are working on now.
A non-sharding model might be hard to sustain indefinitely, but it is not clear that it can't work in the short term, and it is also not clear that a sharded model would deliver a clear performance win, since it would have to incur network latencies inside the queries, which can significantly affect performance; a rough illustration of that effect follows below. This can only be resolved by proper testing and evaluation of the candidate solutions.
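As a back-of-the-envelope illustration of the "extremely slow disk" and in-query network latency points above, here is a toy model of naively hash-sharded triples. The shard count, latency numbers and query shape are made-up assumptions, and real distributed engines batch per-shard requests, so treat this strictly as a sketch of the effect, not a prediction for any particular system.

from hashlib import blake2b

NUM_SHARDS = 8           # assumed cluster size
RTT_MS = 0.5             # assumed same-datacenter round trip per remote lookup
LOCAL_LOOKUP_MS = 0.02   # assumed in-memory index lookup on a single node

def shard_of(subject: str) -> int:
    # Hash-partition by subject: simple and balanced, but it ignores graph
    # locality, so the neighbours of an entity land on arbitrary shards.
    return int(blake2b(subject.encode(), digest_size=4).hexdigest(), 16) % NUM_SHARDS

def estimated_query_ms(intermediate_bindings: int, hops: int) -> float:
    # Each hop of a path query has to resolve every intermediate binding,
    # and with random placement almost every resolution is remote.
    remote_fraction = 1 - 1 / NUM_SHARDS
    lookups = intermediate_bindings * hops
    return lookups * (remote_fraction * RTT_MS + LOCAL_LOOKUP_MS)

print([shard_of(f"wd:Q{i}") for i in range(5)])  # related items scatter across shards

# e.g. a 2-hop path query with ~10k intermediate results
print(f"naively sharded: {estimated_query_ms(10_000, 2):,.0f} ms")  # ~9,150 ms
print(f"single node:     {10_000 * 2 * LOCAL_LOOKUP_MS:,.0f} ms")   # ~400 ms

The point of the sketch is only that a partitioning scheme which knows nothing about the graph turns index lookups into network round trips in the middle of a query, which is why "just shard it" is not automatically a win.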
Then it is not a "cluster" in the sense of databases. It is more a redundancy architecture like RAID 1. Is this really how BlazeGraph does it? Don't they have a proper cluster solution, where they repartition data across servers? Or is this independent-servers setup a Wikimedia staff homebuild?
I do not think our time here would be productively spent arguing semantics about what should and should not be called a "cluster". We call that setup a cluster, and I think we all now understand what we're talking about.
If you mean a sharded or replicated setup: as far as I know, Blazegraph does not support that (there is some support for replication IIRC, but replication without sharding probably won't give us much improvement, since every replica still has to hold the full graph and apply every update). We have a plan to evaluate a solution that does shard, given the necessary resources.
Some info here:
- We evaluated some stores according to their performance:
http://www.semantic-web-journal.net/content/evaluation-metadata-representati... "Evaluation of Metadata Representations in RDF stores"
Thanks for the link, it looks very interesting, I'll read it and see which parts we could use here.
- Virtuoso has proven quite useful. I don't want to advertise here, but the instance they run for DBpedia uses ridiculously modest hardware, i.e. 64 GB of RAM, and it is the open-source version, not the professional edition with clustering and repartitioning capability. We have been playing this game for ten years now: everybody tries other databases, but then most people come back to Virtuoso. I have to admit that OpenLink maintains the hosting for DBpedia themselves, so they know how to optimise it. Their usual customers are large banks with millions of write transactions per hour. In LOD2 they also implemented column-store features with MonetDB and repartitioning in clusters.
I do not know the details of your usage scenario, so before we get into comparisons, I'd like to understand:
1. Do your servers provide live synchronized updates from Wikidata or DBpedia? How many updates per second can that server process?
2. How many queries per second is this server serving? What kind of queries are they?
We did a preliminary, very limited evaluation of Virtuoso for hosting Wikidata, and it looks like it can load and host the necessary data (though it does not support some customizations we have now, and we could not evaluate whether such customizations are possible), but it would require a significant time investment to port all the functionality to it. Unfortunately, lack of resources did not allow us to do a fuller evaluation.
Also, as I understand it, the "professional" capabilities of Virtuoso are closed-source and require a paid license, which would probably be a problem for running it on WMF infrastructure unless we reach some kind of special arrangement. Since such an arrangement would probably not include open-sourcing the enterprise part of Virtuoso, it would have to deliver a very significant, I dare say enormous, advantage for us to consider running it in production. It may be that the open-source version alone is clearly superior to the point that migrating is worth it, but this needs to be established by evaluation.
- I recently heard a presentation from ArangoDB and they had a good cluster concept as well, although I don't know anybody who has tried it. The slides seemed to make sense.
We considered ArangoDB in the past, and it turned out we couldn't use it efficiently at the scale we need (which could be our fault, of course). They also use their own proprietary query language, which might be worth it if they delivered a clear win on all other aspects, but that does not seem to be the case. Also, ArangoDB seems to be a document database inside, which is not what our current data model is (a small sketch of the difference follows below). While it is possible to model Wikidata this way, changing the data model from RDF/SPARQL to a different one is an enormous shift, which can only be justified by an equally enormous improvement in some other area, and that is currently not evident. The project also seems to be still very young. While I would be very interested if somebody took it on themselves to model Wikidata in terms of ArangoDB documents, load the whole dataset and see what the resulting performance would be, I am not sure it would be wise for us to invest our team's currently very limited resources into that.
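To illustrate what "changing the data model" would mean in practice, here is a hedged sketch contrasting a triple view with a document view of the same statements. The IDs are real Wikidata identifiers used only as examples, but the structures are illustrative simplifications, not the exact Wikidata RDF mapping nor ArangoDB's actual document format.

# Triple view: any direction of traversal is just another indexed pattern.
triples = [
    ("wd:Q42", "wdt:P19", "wd:Q350"),   # Douglas Adams -- place of birth -- Cambridge
    ("wd:Q350", "wdt:P17", "wd:Q145"),  # Cambridge -- country -- United Kingdom
]

# Document view: statements are nested under the item they belong to.
documents = {
    "Q42":  {"labels": {"en": "Douglas Adams"}, "claims": {"P19": ["Q350"]}},
    "Q350": {"labels": {"en": "Cambridge"},     "claims": {"P17": ["Q145"]}},
}

# "Which items have place of birth Q350?" is a single lookup in a
# (predicate, object) index over the triples, but over plain documents it
# needs either a full scan or a separately maintained secondary index for
# every property you want to query in reverse.
born_in_q350 = [qid for qid, doc in documents.items()
                if "Q350" in doc.get("claims", {}).get("P19", [])]
print(born_in_q350)  # ['Q42']

Document stores can of course add such indexes, but the query language and the mental model shift away from SPARQL, which is exactly the migration cost being weighed here.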
Thanks,
Hi Stas,
thanks for the elaboration. I understand the background much better now. I have to admit that I am also not a real expert, but I am very close to real experts like Vidal and Rahm, who are co-authors of the SWJ paper, and to the OpenLink devs.
I am also spoiled, because OpenLink handles the hosting for DBpedia and also DBpedia Live, with ca. 130k updates per day (roughly 1.5 updates per second on average) for the English Wikipedia. I think this is the most recent report: https://medium.com/virtuoso-blog/dbpedia-usage-report-as-of-2018-01-01-8cae1... Then again, DBpedia didn't grow for a while, but we have now made a "Best of" [1]. We will not host all of it, though.
[1] https://svn.aksw.org/papers/2019/ISWC_FlexiFusion/public.pdf
I also see that your context is difficult. Maybe you could custom-shard / scale out Blazegraph based on the queries and then replicate the sharded clusters - a mix between sharding and replication, maybe 3 x 3 servers instead of 9 fully replicated ones or 9 servers full of shards (a rough sketch of this layout follows below). There are not many options here if you have the open-source requirement. I guess you are already caching static content as much as possible.
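A rough sketch of the 3 x 3 topology suggested above, assuming a hypothetical router in front of three shard groups of three replicas each. The host names, routing key and layout are illustrative assumptions, not an existing Blazegraph or WDQS feature.

import random
from hashlib import blake2b

# Hypothetical layout: 9 servers as 3 shard groups x 3 replicas each,
# instead of 9 servers each holding a full copy of the graph.
SHARD_GROUPS = [
    ["wdqs-shard0-a", "wdqs-shard0-b", "wdqs-shard0-c"],
    ["wdqs-shard1-a", "wdqs-shard1-b", "wdqs-shard1-c"],
    ["wdqs-shard2-a", "wdqs-shard2-b", "wdqs-shard2-c"],
]

def route(query_key: str) -> str:
    # Pick the shard group that owns the data for this key (e.g. the
    # dominant subject of the query), then spread load across its replicas.
    idx = int(blake2b(query_key.encode(), digest_size=4).hexdigest(), 16) % len(SHARD_GROUPS)
    return random.choice(SHARD_GROUPS[idx])

print(route("wd:Q42"))

Each group would store roughly a third of the data with 3x redundancy, so a write fans out to 3 servers instead of 9; the open question raised earlier in the thread remains, though: queries whose data spans several groups still have to cross the network in the middle of the query.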
This also matches pretty much what I know, but it really is all second-hand, as my expertise is more focused on what's inside the database.
All the best,
Sebastian
Hi!
thanks for the elaboration. I understand the background much better now. I have to admit that I am also not a real expert, but I am very close to real experts like Vidal and Rahm, who are co-authors of the SWJ paper, and to the OpenLink devs.
If you know anybody at OpenLink who would be interested in trying to evaluate such a thing (i.e. how Wikidata could be hosted on Virtuoso) and in providing support for this project, it would be interesting to discuss it. While the open-source question is still a barrier, and in general the requirements are different, at least discussing it and maybe getting some numbers might be useful.
Thanks,
Yes, I can ask. I am talking a lot with them, as we are redeploying DBpedia Live and will also be pushing the new DBpedia to them soon.
I think they also had a specific issue with how Wikidata does linked data, but I didn't catch it, as it was only mentioned briefly.
All the best,
Sebastian
On 6/10/19 4:46 PM, Stas Malyshev wrote:
If you know anybody at OpenLink who would be interested in trying to evaluate such a thing (i.e. how Wikidata could be hosted on Virtuoso) and in providing support for this project, it would be interesting to discuss it. While the open-source question is still a barrier, and in general the requirements are different, at least discussing it and maybe getting some numbers might be useful.
Thanks, -- Stas Malyshev smalyshev@wikimedia.org
I am listening.
I am only a ping away.