On Thu, Mar 5, 2015 at 6:47 PM, Thad Guidry <thadguidry(a)gmail.com> wrote:
> Nik,
> Will you be incorporating MapGraph, as well, with GPU hardware as part of
> the scope of the Wikidata Query Service? Or is that out of scope until
> you know what the load limits will be and just use BlazeGraph as is with
> CPU-bound memory?
MapGraph isn't open source so we won't be using it.
> What are the scalability plans for also using MapGraph with GPUs and
> their memory in the future, in case the need for faster graph traversal
> arises?
So MapGraph is out but otherwise scalability plans are pretty standard
stuff:
1. Instrument for slow stuff
2. Fix bugs that make it slow
3. Buy more servers to scale out when #2 gets too slow to keep up
These servers would just be replicas. This fails when the working set
grows too large and that is something we'll be watching out for.
BlazeGraph has some horizontal scaling features that we'll invoke if we get
there.
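A minimal sketch of step 1, "instrument for slow stuff": time each query and report the ones that cross a threshold. This is purely illustrative; the class and method names here are hypothetical, not anything BlazeGraph or the query service actually ships.

```java
import java.util.concurrent.Callable;
import java.util.function.Consumer;

// Hypothetical instrumentation wrapper: times a unit of work and reports
// it when it runs longer than a configured threshold.
public class SlowQueryLog {
    private final long thresholdMillis;
    private final Consumer<String> report;

    public SlowQueryLog(long thresholdMillis, Consumer<String> report) {
        this.thresholdMillis = thresholdMillis;
        this.report = report;
    }

    // Runs the query, returning its result; reports it if it was slow.
    public <T> T time(String label, Callable<T> query) throws Exception {
        long start = System.nanoTime();
        try {
            return query.call();
        } finally {
            long elapsedMillis = (System.nanoTime() - start) / 1_000_000;
            if (elapsedMillis >= thresholdMillis) {
                report.accept(label + " took " + elapsedMillis + "ms");
            }
        }
    }
}
```

Wiring the reporter to a log or metrics pipeline would then feed step 2 (fixing what's slow) and the decision on when step 3 (more replicas) is needed.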
Furthermore, this will all be reasonably easy to run outside of the cluster, so
if folks need to take it locally and do things with it that we can't (like
MapGraph) then it should work well.
I'm certainly wary of Java. I've worked in Java for years and I'm really
familiar with all of its baggage. BlazeGraph does a very reasonable job
with it. It feels like half of the graph databases are written in Java and
I've always wondered why. Locking down the SPARQL endpoint so it's
"impossible" to overwhelm the system is high on our list of things to do
and Java makes that harder. BlazeGraph's analytic query mode should help
there. Ultimately I see the JVM as a risk to mitigate in this case.
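One standard JVM-side way to keep a single expensive query from overwhelming the system is to bound its execution time and cancel it on timeout. The sketch below uses plain `java.util.concurrent`; the `QueryGuard` name and the stand-in query are hypothetical, not BlazeGraph's actual API.

```java
import java.util.Optional;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical guard around query execution: run each query on a bounded
// pool and cancel it if it exceeds its time budget.
public class QueryGuard {
    private final ExecutorService pool = Executors.newFixedThreadPool(4);

    // Returns the result, or empty if the query timed out or failed.
    public Optional<String> run(Callable<String> query, long timeoutMillis) {
        Future<String> future = pool.submit(query);
        try {
            return Optional.of(future.get(timeoutMillis, TimeUnit.MILLISECONDS));
        } catch (TimeoutException e) {
            future.cancel(true); // interrupt the worker thread
            return Optional.empty();
        } catch (InterruptedException | ExecutionException e) {
            future.cancel(true);
            return Optional.empty();
        }
    }

    public void shutdown() {
        pool.shutdownNow();
    }
}
```

The caveat in the JVM, and part of why Java makes this harder, is that cancellation is cooperative: `future.cancel(true)` only interrupts the thread, and a query that never checks its interrupt flag keeps burning CPU.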
Nik