Hi!
So I did some light benchmarking, and it looks like a single server can handle 700 to 800 rps for TPF queries without a significant rise in load (which is understandable, since the work is almost all IO). Median time for a single request seems to be around 150 ms, and 99th-percentile time around 500 ms. This quick test was run with 150 parallel threads.
I've re-run the benchmark with the best-practices settings, again on 150 threads, this time randomizing the patterns I look up, and it gave me over 1000 rps with an average response time of around 150 ms. The load was slightly higher but nowhere near the maximum.
So these are the parameters so far (remember they are for one server, so three servers should ideally handle about 3x that).
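As a rough sanity check on the numbers above, here is a small Python sketch. It assumes the benchmark is a closed loop (each of the 150 threads sends its next request only after the previous one returns), which I have not verified for the tool used, and it assumes perfectly linear scaling across servers. The function names are made up for illustration.

```python
def closed_loop_rps(threads: int, mean_latency_ms: float) -> float:
    """Throughput expected from a closed-loop test: each thread completes
    one request per mean_latency_ms (Little's law, assumed to apply here)."""
    return threads * 1000 / mean_latency_ms


def implied_mean_latency_ms(threads: int, rps: float) -> float:
    """Mean latency implied by an observed throughput, same assumption."""
    return threads * 1000 / rps


# Second run: 150 threads at ~150 ms average response time.
single_server = closed_loop_rps(150, 150)

# Ideal case only: assumes three servers scale linearly with no shared bottleneck.
three_servers = 3 * single_server

print(f"single server: {single_server:.0f} rps")
print(f"three servers (ideal): {three_servers:.0f} rps")

# First run: 700 to 800 rps on 150 threads implies a mean latency in this range.
for rps in (700, 800):
    print(f"{rps} rps -> ~{implied_mean_latency_ms(150, rps):.0f} ms mean latency")
```

Under those assumptions, 150 threads at 150 ms works out to about 1000 rps, which lines up with the second run. For the first run, 700 to 800 rps implies a mean somewhat above the 150 ms median, which is plausible given the 500 ms tail.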