Hi Stas,

thanks for the elaboration; now I can understand the background much better. I have to admit that I am not a real expert myself either, but I am very close to the real experts, like Vidal and Rahm, who are co-authors of the SWJ paper, and the OpenLink devs.

I am also spoiled, because OpenLink handles the hosting for DBpedia and
also for DBpedia-live, with ca. 130k updates per day for the English
Wikipedia. I think this is the most recent report:
https://medium.com/virtuoso-blog/dbpedia-usage-report-as-of-2018-01-01-8cae1b81ca71
Then again, DBpedia didn't grow for a while, but we have made a "Best
of" release now [1]. But they will not host it all.

[1] https://svn.aksw.org/papers/2019/ISWC_FlexiFusion/public.pdf

I also see that your context is difficult. Maybe you can custom-shard /
scale out Blazegraph based on the queries and then replicate the sharded
clusters - like a mix between sharding and replication, maybe just 3 x 3
servers instead of 9 fully replicated ones or 9 servers full of shards.
There are not so many options here if you have the open-source
requirement. I guess you are already caching static content as much as
possible.
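
Just to make the 3 x 3 idea a bit more concrete, here is a toy routing
sketch in Python (purely illustrative; the host names and the hash-based
routing key are made up, and Blazegraph itself would know nothing about
such a layer):

    import hashlib
    import random

    # Hypothetical 3 x 3 layout: three shard groups, each holding a
    # different slice of the graph, each group replicated on three servers.
    # The endpoint URLs are placeholders, not real hosts.
    SHARD_GROUPS = [
        ["http://wdqs-a1:9999/sparql", "http://wdqs-a2:9999/sparql", "http://wdqs-a3:9999/sparql"],
        ["http://wdqs-b1:9999/sparql", "http://wdqs-b2:9999/sparql", "http://wdqs-b3:9999/sparql"],
        ["http://wdqs-c1:9999/sparql", "http://wdqs-c2:9999/sparql", "http://wdqs-c3:9999/sparql"],
    ]

    def pick_endpoint(routing_key: str) -> str:
        """Pick a shard group by a stable hash of a routing key (e.g. the
        main entity a query touches), then load-balance across replicas."""
        digest = hashlib.sha1(routing_key.encode("utf-8")).hexdigest()
        group = SHARD_GROUPS[int(digest, 16) % len(SHARD_GROUPS)]
        return random.choice(group)

    print(pick_endpoint("Q42"))

The hard part is of course choosing the routing key and the partitioning
so that most queries stay inside one group; that is exactly the piece an
off-the-shelf cluster solution would otherwise handle for you.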

This also matches pretty much what I know, but it really is all
second-hand, as my expertise is more focused on what's inside the
database.

All the best,

Sebastian

On 10.06.19 22:02, Stas Malyshev wrote:
Hi!

> I am not sure how to evaluate this correctly. Scaling databases in
> general is a "known hard problem", and graph databases are a sub-field
> of it, optimized for graph-like queries as opposed to column stores or
> relational databases. If you say that "throwing hardware at the
> problem" does not help, you are admitting that Blazegraph does not
> scale for what is needed by Wikidata.

I think this is over-generalizing. We have a database that grew 10x over
the last 4 years. We have certain hardware and software limits, both
with existing hardware and, in principle, with hardware we could buy. We
also have certain issues specific to graph databases that make scaling
harder - for example, document databases like ElasticSearch, and certain
models of relational databases, shard easily. Sharding something like
the Wikidata graph is much harder, especially if the underlying database
knows nothing about the specifics of Wikidata's data (which would be the
case for all off-the-shelf databases). If we just randomly split the
triples between several servers, we'd probably just be modeling a large
but extremely slow disk. So there needs to be some smarter solution -
one that we'd rather not develop in-house, but one that has already been
verified by industry experience and other deployments.
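
Just to illustrate the "slow disk" point with a toy sketch (this is not
anything we run, only an illustration of what random placement does):

    import hashlib

    NUM_SHARDS = 3

    def shard_of(triple, num_shards=NUM_SHARDS):
        # "Random" placement: a stable hash of the whole triple, ignoring
        # which subject or predicate it belongs to.
        key = " ".join(triple).encode("utf-8")
        return int(hashlib.sha1(key).hexdigest(), 16) % num_shards

    triples = [
        ("Q42", "P31", "Q5"),       # Douglas Adams - instance of - human
        ("Q42", "P106", "Q36180"),  # Douglas Adams - occupation - writer
        ("Q42", "P19", "Q350"),     # Douglas Adams - place of birth - Cambridge
    ]

    for t in triples:
        print(shard_of(t), t)

    # Triples about the same entity typically end up on different shards,
    # so a pattern like "?x wdt:P31 wd:Q5 . ?x wdt:P106 ?occupation" cannot
    # be answered by one server; every join step has to ship intermediate
    # results over the network to (potentially) all shards.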

Is the issue specific to Blazegraph, and can it be solved by switching
platforms? Maybe - we do not know yet. We have not yet identified a
better solution that guarantees better scalability, but we have a plan
for looking for such a solution, given the resources. We also have a
plan for improving the throughput of Blazegraph, which we're working on
now.

A non-sharded model might be hard to sustain indefinitely, but it is not
clear it can't work in the short term, and it is also not clear that a
sharded model would deliver a clear performance win, as it has to pay
network latencies inside the queries, which can significantly affect
performance. This can only be resolved by proper testing and evaluation
of the candidate solutions.
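
As a back-of-the-envelope illustration of that latency concern (all
numbers below are assumptions, not measurements):

    # Rough per-query cost of moving join steps across the network.
    join_steps = 20               # assume a query with ~20 join/lookup steps
    local_lookup_s = 50e-6        # ~50 microseconds per step against local RAM/SSD
    cross_shard_rtt_s = 500e-6    # ~0.5 ms network round trip per cross-shard step

    single_node_ms = join_steps * local_lookup_s * 1000
    sharded_ms = join_steps * cross_shard_rtt_s * 1000

    print(f"single node: {single_node_ms:.1f} ms, naive sharding: {sharded_ms:.1f} ms")
    # Sharding only pays off if parallelism and the larger aggregate RAM
    # buy back more than this per-step network overhead.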

> Then it is not a "cluster" in the sense of databases. It is more a
> redundancy architecture like RAID 1. Is this really how BlazeGraph does

I do not think our time here would be productively spent arguing
semantics about what should and should not be called a "cluster". We
call that setup a cluster, and I think now we all understand what we're
talking about.

> it? Don't they have a proper cluster solution, where they repartition
> data across servers? Or are these independent servers a Wikimedia
> staff homebuild?

If you mean a sharded or replicated setup, then as far as I know
Blazegraph does not support that (there is some support for replication,
IIRC, but replication without sharding probably won't give us much
improvement). We have a plan to evaluate a solution that does shard,
given the necessary resources.
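
To put that in simple numbers (the triple count below is an assumed
order of magnitude, purely for illustration): replication does not
reduce what each server has to hold or write, only sharding does.

    # Illustrative only; the triple count is an assumed order of magnitude.
    total_triples = 10_000_000_000

    # Pure replication (roughly our current setup): every server holds and
    # indexes the full graph and applies every update, so neither data size
    # per box nor write load goes down.
    per_server_replicated = total_triples

    # A hypothetical 3-way sharded layout: each server holds about a third.
    per_server_sharded = total_triples // 3

    print(f"{per_server_replicated:,} vs {per_server_sharded:,} triples per server")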

> Some info here:
>
> - We evaluated some stores according to their performance:
> http://www.semantic-web-journal.net/content/evaluation-metadata-representations-rdf-stores-0
> "Evaluation of Metadata Representations in RDF stores"

Thanks for the link, it looks very interesting; I'll read it and see
which parts we could use here.

> - Virtuoso has proven quite useful. I don't want to advertise here, but
> the thing they have running for DBpedia uses ridiculously modest
> hardware, i.e. 64 GB RAM, and it is also the open-source version, not
> the professional one with clustering and repartitioning capability. So
> we have been playing this game for ten years now: everybody tries other
> databases, but then most people come back to Virtuoso. I have to admit
> that OpenLink maintains the hosting for DBpedia themselves, so they
> know how to optimise. They normally have large banks as customers, with
> millions of write transactions per hour. In LOD2 they also implemented
> column-store features with MonetDB and repartitioning in clusters.

I do not know the details of your usage scenario, so before we get into
comparisons, I'd like to understand:

1. Do your servers provide live synchronized updates with Wikidata or
DBpedia? How many updates per second can that server process?
2. How many queries per second is this server serving? What kind of
queries are those?

We did a preliminary, very limited evaluation of Virtuoso for hosting
Wikidata, and it looks like it can load and host the necessary data
(though it does not support some customizations we have now, and we
could not evaluate whether such customizations are possible), but it
would require a significant time investment to port all the
functionality to it. Unfortunately, lack of resources did not allow us
to do a fuller evaluation.

Also, as I understand it, the "professional" capabilities of Virtuoso
are closed-source and require a paid license, which would probably be a
problem for running it on WMF infrastructure unless we reach some kind
of special arrangement. Since this arrangement would probably not
include open-sourcing the enterprise part of Virtuoso, it would have to
deliver a very significant, I dare say enormous, advantage for us to
consider running it in production. It may be that just the open-source
version is also clearly superior, to the point that it is worth
migrating, but this needs to be established by evaluation.

> - I recently heard a presentation from ArangoDB and they had a good
> cluster concept as well, although I don't know anybody who has tried
> it. The slides seemed to make sense.

We considered ArangoDB in the past, and it turned out we couldn't use it
efficiently at the scale we need (which could be our fault, of course).
They also use their own proprietary query language, which might be worth
it if they delivered a clear win on all other aspects, but that does not
seem to be the case.

Also, ArangoDB seems to be a document database inside. This is not what
our current data model is. While it is possible to model Wikidata in
this way, again, changing the data model from RDF/SPARQL to a different
one is an enormous shift, which can only be justified by an equally
enormous improvement in some other area, and currently that is not
clear. The project also still seems to be very young. While I would be
very interested if somebody took it upon themselves to model Wikidata in
terms of ArangoDB documents, load the whole dataset and see what the
resulting performance would be, I am not sure it would be wise to invest
our team's - currently very limited - resources into that.

Thanks,
--
All the best,
Sebastian Hellmann

Director of Knowledge Integration and Linked Data Technologies (KILT) Competence Center
at the Institute for Applied Informatics (InfAI) at Leipzig University
Executive Director of the DBpedia Association
Projects: http://dbpedia.org, http://nlp2rdf.org, http://linguistics.okfn.org, https://www.w3.org/community/ld4lt
Homepage: http://aksw.org/SebastianHellmann
Research Group: http://aksw.org