On 06-08-2016 17:56, Stas Malyshev wrote:
> Hi!
>> On a side note, the results we presented for BlazeGraph could improve
>> dramatically if one could isolate queries that timed out. Once one query
>> in a sequence timed out (we used server-side timeouts), we observed that
>> a run of queries would then time out, possibly a locking problem or
> Could you please give a bit more detail about this failure scenario? Is
> it that several queries are run in parallel and one query, timing out,
> hurts the performance of others? Does it happen even after the long query
> times out? Or was it a sequential run, where after one query timed out,
> the next query had worse performance than the same query when not
> preceded by the timing-out query, i.e. the timeout had a persistent
> effect beyond its initial run?
The latter was the case, yes. We ran the queries in a given batch
sequentially (waiting for one to finish before the next was run) and
when one timed out, the next would almost surely time out and the engine
would not recover.
We tried a few things on this, like waiting an extra 60 seconds before
running the next query, and also changing memory configurations to avoid
GC issues. I believe Daniel was also in contact with the devs.
Ultimately we figured we probably couldn't resolve the issue without
touching the source code, which would obviously not be fair.
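For concreteness, the sequential protocol could be sketched roughly as follows (the names and the `run_query` callable are hypothetical illustrations, not our actual scripts):

```python
import time

def run_batch(queries, run_query, timeout=60.0, cooldown=0.0):
    """Run queries strictly one after another, recording wall-clock times.

    `run_query` is a hypothetical callable that submits one query to the
    engine and raises TimeoutError when the server-side timeout fires.
    `cooldown` models the extra wait between queries that we also tried.
    """
    results = []
    for q in queries:
        start = time.monotonic()
        try:
            run_query(q)
            status = "ok"
        except TimeoutError:
            status = "timeout"
        elapsed = time.monotonic() - start
        results.append((q, status, min(elapsed, timeout)))
        if cooldown:
            time.sleep(cooldown)
    return results
```

The point is that each query only starts once the previous one has finished or timed out, so any slowdown in a later query cannot be blamed on concurrency.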
> BTW, what was the timeout setting in your experiments? I see in the
> article that it says "timeouts are counted as 60 seconds" - does it mean
> that Blazegraph had its internal timeout set to 60 seconds, or that
> the setting was different, but when processing results, the actual run
> time was replaced by 60 seconds?
Yup, the settings are here:
http://users.dcc.uchile.cl/~dhernand/wquery/#configure-blazegraph
My understanding is that with those settings, we set an internal timeout
on BlazeGraph of 60 seconds.
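So when aggregating results, a query that hit the timeout is simply counted as having taken 60 seconds. As a trivial sketch of that accounting (a hypothetical helper, not our actual analysis code):

```python
def mean_runtime(runtimes, timed_out, cap=60.0):
    """Mean runtime over a batch, counting each timed-out query as
    `cap` seconds (and capping any straggler that slightly overshoots)."""
    capped = [cap if hit else min(t, cap)
              for t, hit in zip(runtimes, timed_out)]
    return sum(capped) / len(capped)

# e.g. 10 s, one timeout, and 20 s average to (10 + 60 + 20) / 3 = 30 s
```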
> Also, did you use analytic mode for the queries?
> https://wiki.blazegraph.com/wiki/index.php/QueryEvaluation#Analytic_Query_E…
> https://wiki.blazegraph.com/wiki/index.php/AnalyticQuery
> This is the mode that is turned on automatically for the Wikidata Query
> Service, and it uses AFAIK different memory management, which may
> influence how the cases you had problems with are handled.
This I am not aware of. I would have to ask Daniel to be sure (I know he
spent quite a lot of time playing around with different settings in the
case of BlazeGraph).
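From a quick read of those wiki pages, analytic mode can apparently also be requested per query via an `analytic=true` request parameter, in which case enabling it in a harness would be a one-line change. A sketch, where the helper and the endpoint URL are hypothetical:

```python
from urllib.parse import urlencode

def build_sparql_url(endpoint, query, analytic=False):
    """Build a GET URL for a SPARQL query against a Blazegraph-style
    endpoint; `analytic=true` requests analytic mode per my reading of
    the Blazegraph wiki (the endpoint shown below is hypothetical)."""
    params = {"query": query}
    if analytic:
        params["analytic"] = "true"
    return endpoint + "?" + urlencode(params)

url = build_sparql_url("http://localhost:9999/blazegraph/sparql",
                       "SELECT * WHERE { ?s ?p ?o } LIMIT 10",
                       analytic=True)
```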
> I would appreciate as much detail as you could give on this, as this may
> also be useful for current query engine work. Also, if you're interested
> in the work done on WDQS, our experiences, and the reasons for certain
> decisions and setups, I'd be glad to answer any questions.
I guess to start with you should have a look at the documentation here:
http://users.dcc.uchile.cl/~dhernand/wquery/
If there's some details missing from that, or if you have any further
questions, I can put you in contact with Daniel who did all the scripts,
ran the experiments, was in discussion with the devs, etc. in the
context of BlazeGraph. (I don't think he's on this list.)
I could also ask him to try to create a minimal-ish test case that
reproduces the problem.
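I'd imagine a minimal repro would look something like this: time a cheap probe query in isolation, then again right after a poison query has timed out, and compare the two. A sketch, where `run_query` is again a hypothetical callable:

```python
import time

def probe_after_timeout(run_query, poison, probe, pause=0.0):
    """Time `probe` on its own, then again right after `poison` hits the
    server-side timeout, to check whether the timeout has a persistent
    effect. Returns (baseline_seconds, after_timeout_seconds)."""
    def timed(q):
        start = time.monotonic()
        try:
            run_query(q)
        except TimeoutError:
            pass
        return time.monotonic() - start

    baseline = timed(probe)   # probe against a fresh engine
    timed(poison)             # drive the engine into a timeout
    if pause:
        time.sleep(pause)     # e.g. the extra 60 s wait we tried
    return baseline, timed(probe)
```

If the second number is consistently much larger than the first, that would demonstrate the persistent effect we observed.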
>> resource leak. Also Daniel mentioned that from discussion with the devs,
>> they claim that the current implementation works best on SSD hard
>> drives; our experiments were on a standard SATA drive.
> Yes, we run it on SSDs. Judging from our tests on test servers running
> on virtualized SATA machines, the difference is indeed dramatic (orders
> of magnitude and more for some queries). Then again, this is highly
> unscientific anecdotal evidence; we didn't make anything resembling
> formal benchmarks, since the test hardware is clearly inferior to the
> production one and is meant to be so. But the point is that an SSD is
> likely a must for Blazegraph to work well on this data set. It might
> also improve results for other engines, so I'm not sure how it
> influences the comparison between the engines.
Yes, I think this was the message we got from the mailing lists when we
were trying to troubleshoot these issues: it would be better to use an
SSD. But we did not have one, and of course we didn't want to tailor our
hardware to suit one particular engine.
Unfortunately I think all such empirical experiments are in some sense
anecdotal; even ours. We cannot deduce, for example, what would happen,
relatively speaking, on a machine with an SSD, or more cores, or with
multiple instances. But still, one can learn a lot from good anecdotes.
Cheers,
Aidan