Hi Stas,
thanks for the elaboration. I can understand the background much better now. I have to admit that I am not a real expert either, but I am very close to the real experts, like Vidal and Rahm, who are co-authors of the SWJ paper, or the OpenLink devs.
I am also spoiled, because OpenLink handles the hosting for DBpedia and also DBpedia-live, with ca. 130k updates per day for the English Wikipedia. I think this is the most recent report: https://medium.com/virtuoso-blog/dbpedia-usage-report-as-of-2018-01-01-8cae1... Then again, DBpedia didn't grow for a while, but we have made a "Best of" now [1], although we will not host it all.
[1] https://svn.aksw.org/papers/2019/ISWC_FlexiFusion/public.pdf
I also see that your situation is difficult. Maybe you can custom-shard/scale out Blazegraph based on the queries and then replicate the sharded clusters, i.e. a mix of sharding and replication: maybe just 3 * 3 servers instead of 9 fully replicated ones or 9 servers full of shards (a rough sketch of what I mean follows below). There are not so many options here if you have the open-source requirement. I guess you are already caching static content as much as possible.
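To make the 3 * 3 idea a bit more concrete, here is a tiny Python sketch of the topology, nothing more than an illustration: the hostnames and the hash-based routing rule are invented (in practice you would derive the split from your actual query workload), and none of this is an existing Blazegraph feature.

import random

# Illustrative only: 3 shard groups, each holding one slice of the data,
# each replicated 3 times. Hostnames are made up.
CLUSTER = {
    0: ["wdqs-shard0-a", "wdqs-shard0-b", "wdqs-shard0-c"],
    1: ["wdqs-shard1-a", "wdqs-shard1-b", "wdqs-shard1-c"],
    2: ["wdqs-shard2-a", "wdqs-shard2-b", "wdqs-shard2-c"],
}

def pick_read_server(routing_key):
    # Map the key to a shard group; any replica in that group can answer reads.
    group = CLUSTER[hash(routing_key) % len(CLUSTER)]
    return random.choice(group)

def write_targets(routing_key):
    # Updates must reach every replica of the owning shard group.
    return CLUSTER[hash(routing_key) % len(CLUSTER)]

print(pick_read_server("Q42"))
print(write_targets("Q42"))

The point is only that reads scale with the number of replicas per group while data volume scales with the number of groups, so you trade 9-way full replication for 3-way replication of 3 slices.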
This is also pretty much the extent of what I know, and it really is all second-hand, as my expertise is focused more on what is inside the database.
All the best,
Sebastian
On 10.06.19 22:02, Stas Malyshev wrote:
Hi!
> I am not sure how to evaluate this correctly. Scaling databases in general is a "known hard problem", and graph databases are a sub-field of it, optimized for graph-like queries as opposed to column stores or relational databases. If you say that "throwing hardware at the problem" does not help, you are admitting that Blazegraph does not scale to what Wikidata needs.
I think this is over-generalizing. We have a database that grew 10x over the last 4 years. We have certain hardware and software limits, both with existing hardware and in principle with hardware we could buy. We also have certain issues specific to graph databases that make scaling harder - for example, document databases like ElasticSearch, and certain models of relational databases, shard easily. Sharding something like the Wikidata graph is much harder, especially if the underlying database knows nothing about the specifics of Wikidata's data (which would be the case for all off-the-shelf databases). If we just randomly split the triples between several servers, we'd probably just be modeling a large but extremely slow disk. So there needs to be a smarter solution, one that we would not want to develop in-house but one that has already been validated by industry experience and other deployments.
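To illustrate the point about naive splitting (toy code, nothing to do with our actual setup; the triples are example data, not real Wikidata statements): if triples are placed by hashing the subject, a simple two-hop pattern like ?person P19 ?place . ?place P17 ?country has to follow each intermediate binding to whatever server it happens to live on, so every binding can turn into a network round trip instead of a local index probe.

SHARDS = 3
shards = [[] for _ in range(SHARDS)]

def shard_of(node):
    # Naive placement: every triple lives on the shard of its subject.
    return hash(node) % SHARDS

triples = [
    ("Q42", "P19", "Q350"),     # person -> place of birth -> city
    ("Q350", "P17", "Q145"),    # city -> country
    ("Q1868", "P19", "Q1731"),  # another (made-up) example pair
    ("Q1731", "P17", "Q183"),
]
for s, p, o in triples:
    shards[shard_of(s)].append((s, p, o))

# Pattern: ?person P19 ?place . ?place P17 ?country
remote_lookups = 0
for shard in shards:                        # first hop: scan every shard
    for (person, p, place) in shard:
        if p != "P19":
            continue
        remote_lookups += 1                 # second hop: usually another shard
        countries = [o for (s2, p2, o) in shards[shard_of(place)]
                     if s2 == place and p2 == "P17"]
        print(person, place, countries)

print("lookups that may cross shards:", remote_lookups)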
Is the issue specific to Blazegraph, and can it be solved by switching platforms? Maybe; we do not know yet. We have not identified a better solution that guarantees better scalability, but we have a plan for looking for that solution, given the resources. We also have a plan for improving the throughput of Blazegraph, which we're working on now.
A non-sharding model might be hard to sustain indefinitely, but it is not clear that it can't work in the short term. It is also not clear that a sharded model would deliver a clear performance win, as it would introduce network latencies inside the queries, which can significantly affect performance. This can only be resolved by proper testing and evaluation of the candidate solutions.
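As a back-of-envelope illustration of the latency concern (all numbers here are assumptions for the sake of the example, not measurements of any real setup):

rtt_ms = 0.5           # assumed intra-datacenter round trip
local_probe_ms = 0.02  # assumed local index lookup
bindings = 10_000      # intermediate results feeding the next join step

single_server = bindings * local_probe_ms
naive_sharded = bindings * (rtt_ms + local_probe_ms)             # one round trip per binding
batched = (bindings / 500) * rtt_ms + bindings * local_probe_ms  # 500 bindings per request

print(f"single server: {single_server:.0f} ms")   # 200 ms
print(f"naive sharded: {naive_sharded:.0f} ms")   # 5200 ms
print(f"batched      : {batched:.0f} ms")         # 210 ms

Whether a real distributed engine lands closer to the batched number or the naive one is exactly the kind of thing only evaluation can tell us.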
> Then it is not a "cluster" in the sense of databases. It is more a redundancy architecture like RAID 1. Is this really how Blazegraph does
I do not think our time here would be productively spent arguing semantics about what should and should not be called a "cluster". We call that setup a cluster, and I think now we all understand what we're talking about.
> it? Don't they have a proper cluster solution, where they repartition data across servers? Or are these independent servers a Wikimedia staff homebuild?
If you mean a sharded or replicated setup, as far as I know Blazegraph does not support that (there is some support for replication IIRC, but replication without sharding probably won't give us much improvement). We have a plan to evaluate a solution that does shard, given the necessary resources.
> Some info here:
> - We evaluated some stores according to their performance:
> http://www.semantic-web-journal.net/content/evaluation-metadata-representati... "Evaluation of Metadata Representations in RDF stores"
Thanks for the link, it looks very interesting, I'll read it and see which parts we could use here.
> - Virtuoso has proven quite useful. I don't want to advertise here, but
> the setup they run for DBpedia uses ridiculously modest hardware, i.e. 64 GB RAM, and it is also the open-source version, not the professional one with clustering and repartitioning capability. So we have been playing this game for ten years now: everybody tries other databases, but most people come back to Virtuoso. I have to admit that OpenLink maintains the hosting for DBpedia themselves, so they know how to optimise it. Their customers are normally large banks with millions of write transactions per hour. In LOD2 they also implemented column-store features with MonetDB and repartitioning in clusters.
I do not know the details of your usage scenario, so before we get into comparisons, I'd like to understand:
1. Do your servers provide live synchronized updates with Wikidata or DBpedia? How many updates per second can that server process?
2. How many queries per second is this server serving? What kind of queries are those?
We did a preliminary, very limited evaluation of Virtuoso for hosting Wikidata, and it looks like it can load and host the necessary data (though it does not support some customizations we have now, and we could not evaluate whether such customizations are possible), but it would require a significant time investment to port all the functionality to it. Unfortunately, the lack of resources did not allow us to do a fuller evaluation.
Also, as I understand it, the "professional" capabilities of Virtuoso are closed-source and require a paid license, which would probably be a problem for running it on WMF infrastructure unless we reach some kind of special arrangement. Since such an arrangement would probably not include open-sourcing the enterprise part of Virtuoso, it would have to deliver a very significant, I dare say enormous, advantage for us to consider running it in production. It may be that just the open-source version is also clearly superior, to the point that it is worth migrating, but this needs to be established by evaluation.
> - I recently heard a presentation from ArangoDB and they had a good
> cluster concept as well, although I don't know anybody who has tried it. The slides seemed to make sense.
We considered ArangoDB in the past, and it turned out we couldn't use it efficiently at the scale we need (which could be our fault, of course). They also use their own proprietary query language, which might be worth it if they delivered a clear win on all other aspects, but that does not seem to be the case. Also, ArangoDB seems to be a document database inside, which is not what our current data model is. While it is possible to model Wikidata this way (a rough illustration of the difference follows below), changing the data model from RDF/SPARQL to a different one is an enormous shift, which can only be justified by an equally enormous improvement in some other area, and that is currently not evident. The project also still seems very young. While I would be very interested if somebody took it upon themselves to model Wikidata in terms of ArangoDB documents, load the whole data set and see what the resulting performance would be, I am not sure it would be wise for us to invest our team's currently very limited resources into that.
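Just to illustrate what such a shift would mean (purely an illustration: the field names below are invented and are neither Wikibase's nor ArangoDB's actual schema, and the identifiers are only stand-ins), the same statement looks quite different as RDF-style triples versus a single nested document:

# Roughly our current shape: a statement node plus qualifier triples.
triples = [
    ("wd:Q42", "p:P69", "wds:Q42-abc"),       # item -> statement node
    ("wds:Q42-abc", "ps:P69", "wd:Q691283"),  # statement value ("educated at")
    ("wds:Q42-abc", "pq:P582", "1974"),       # qualifier ("end time")
]

# A document-store shape: everything about the item nested in one record.
document = {
    "_key": "Q42",
    "labels": {"en": "Douglas Adams"},
    "claims": {
        "P69": [
            {"value": "Q691283", "qualifiers": {"P582": "1974"}},
        ],
    },
}

Every query, every qualifier/rank/reference feature and the whole update pipeline would have to be re-expressed against the second shape, which is why the win would have to be enormous to justify it.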
Thanks,