Hi!
Maybe I am slightly confused here. The number of 1000 requests per
second seems to be too low if a single query leads to 100 rps, no? Or do
you mean 1000K rps?
No, it can't be 1000K rps - that would require microsecond-grade
roundtrip times over the network, which I don't think is possible. Anyway,
this was more a test of how the server handles the requests, and what we
are getting from it is that a single request is processed in around 100ms.
Which, given you're allowed 5 connections, gives you about 50 rps tops
per client, as I understand (at least for those that hit the server; caching
is different). I'm not sure we can improve server roundtrip time by
much, given that the network is involved. Even if we somehow improve it
by an order of magnitude, you still don't get to even 1K rps without
raising parallelism dramatically. Which right now we can't do, but in
the future maybe - but then there are other limitations, since handling
massively parallel connections on a single server has its limits too - I'm
not sure the Java servlet model is good at such things. So we shouldn't
expect super-high throughput from it, I think.
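
For what it's worth, here is the back-of-the-envelope arithmetic behind
those numbers as a small Python sketch. The function names are made up for
illustration, and it assumes each connection handles one request at a time,
with no caching or pipelining, so it is only a rough upper bound:

```python
def max_rps(latency_ms: float, connections: int) -> float:
    """Rough upper bound on requests/second for one client.

    Assumption: each connection has exactly one request in flight at a
    time, so each connection completes 1000 / latency_ms requests/second.
    """
    return connections * 1000 / latency_ms


def required_latency_ms(target_rps: float, connections: int) -> float:
    """Roundtrip time (ms) needed to reach target_rps under the same assumption."""
    return connections * 1000 / target_rps


# ~100ms per request and 5 allowed connections, as measured above.
print(max_rps(100, 5))                    # 50.0 rps per client

# Roundtrip time improved by an order of magnitude: still under 1K rps.
print(max_rps(10, 5))                     # 500.0 rps per client

# Roundtrip time that 1000K rps would need with 5 connections, in ms.
print(required_latency_ms(1_000_000, 5))  # 0.005 ms, i.e. about 5 microseconds
```

So with the connection limit fixed, the only knob left is roundtrip time,
and the last line shows why 1000K rps is out of reach over a network.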
Of course adding more servers will help, like it also does with
full-fledged SPARQL. But then there is no advantage compared to SPARQL.
We know that we can do 20-30 SPARQL queries per second with two servers.
That depends a lot on the kind of queries. Simple queries, yes, they are
essentially no different from TPF requests - the only difference is the
SPARQL parser, which doesn't take much in the overall scheme of things. With
TPF, all queries are simple; with SPARQL, decidedly not so :)
--
Stas Malyshev
smalyshev(a)wikimedia.org