I have recently encountered this text in which the author claims very high MySQL speedups for simple queries (7.5 times faster than regular MySQL queries, and twice as fast as memcached) by reading the data directly from InnoDB where possible (MySQL is still used for writing and for complex queries). Knowing that a faster DB is always good, I thought this would be an interesting thing to consider :)
http://yoshinorimatsunobu.blogspot.com/2010/10/using-mysql-as-nosql-story-fo...
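Roughly, the pattern the post describes is: serve primary-key lookups straight from the storage engine, and fall back to normal SQL for everything else. A minimal sketch of that read path (the hs_client helper, table and column names are invented for illustration):

def get_user(user_id, hs_client, sql_conn):
    # Fast path: a primary-key lookup that bypasses SQL parsing entirely.
    # hs_client.get() is a hypothetical helper, not a real library call.
    row = hs_client.get("user", "PRIMARY", user_id)
    if row is not None:
        return row
    # Slow path: anything the fast path can't answer still goes through SQL.
    cur = sql_conn.cursor()
    cur.execute("SELECT user_name, user_email FROM user WHERE user_id = %s",
                (user_id,))
    return cur.fetchone()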
Nikola Smolenski wrote:
I have recently encountered this text in which the author claims very high MySQL speedups for simple queries by reading the data directly from InnoDB where possible. [...]
It looks interesting. There are some places where MediaWiki could take that shortcut if it were available. I wonder if we have such a CPU bottleneck, though.
On Thu, Dec 23, 2010 at 5:19 PM, Platonides <Platonides@gmail.com> wrote:
It looks interesting. There are some places where MediaWiki could take that shortcut if it were available. I wonder if we have such a CPU bottleneck, though.
I was afraid it would be a local-DB-only hack, but on reading it, they implemented a full network service (text-based, with telnet usable as a minimal interface, which is good for sysadmins/engineers and for debugging) inside MySQL's main daemon, which was pretty darn cool of them.
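To give a feel for it, here is roughly what a client exchange looks like over that text protocol, based on my reading of the post (port number, database/table/column names are made up, and the exact field layout is from memory, so take it as an approximation):

import socket

def hs_read(host, user_id):
    # Reader port as described in the post; fields are tab-separated,
    # requests and responses are newline-terminated lines.
    s = socket.create_connection((host, 9998))
    try:
        # Open an index handle: P <handle> <db> <table> <index> <columns>
        s.sendall(b"P\t0\ttest\tuser\tPRIMARY\tuser_name,user_email\n")
        print(s.recv(4096))   # "0\t1\n" would mean success
        # Look up one row by primary key: <handle> <op> <numkeys> <key> <limit> <offset>
        s.sendall(("0\t=\t1\t%d\t1\t0\n" % user_id).encode())
        print(s.recv(4096))   # e.g. "0\t2\t<name>\t<email>\n" if the row exists
    finally:
        s.close()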
Even if this isn't ultimately relevant for MediaWiki or the Wikimedia projects, thanks for posting it to the list, Nikola. I do a lot of systems architecture work involving DBs, and I could well end up recommending something like this for a customer application.
On Thu, Dec 23, 2010 at 9:34 AM, Nikola Smolenski <smolensk@eunet.rs> wrote:
I have recently encountered this text in which the author claims very high MySQL speedups for simple queries by reading the data directly from InnoDB where possible. [...]
It's easy to get fast results if you don't care about your reads being atomic (*), and I find it hard to believe they've managed to get atomic reads without going through MySQL.
(*) Among other possibilities, just use MyISAM.
Hi!
It's easy to get fast results if you don't care about your reads being atomic (*), and I find it hard to believe they've managed to get atomic reads without going through MySQL.
MySQL's upper layers know very little about transactions; it is all engine-specific - BEGIN and COMMIT processing is deferred to the table handlers. It would be incredibly easy for them to implement repeatable-read snapshots :) (if that's what you mean by an atomic read)
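(For illustration, this is the kind of snapshot read meant here, expressed through plain SQL; the table and connection details below are invented:)

import mysql.connector

def connect():
    return mysql.connector.connect(host="127.0.0.1", user="app",
                                   password="secret", database="test")

a, b = connect(), connect()
ca, cb = a.cursor(), b.cursor()

# Under REPEATABLE READ, the first SELECT establishes a snapshot...
ca.execute("SET SESSION TRANSACTION ISOLATION LEVEL REPEATABLE READ")
ca.execute("START TRANSACTION")
ca.execute("SELECT user_email FROM user WHERE user_id = 42")
before = ca.fetchone()

# ...so a committed change from another connection...
cb.execute("UPDATE user SET user_email = 'new@example.org' WHERE user_id = 42")
b.commit()

# ...is not visible inside the still-open transaction.
ca.execute("SELECT user_email FROM user WHERE user_id = 42")
after = ca.fetchone()   # equals `before`
a.rollback()
a.close()
b.close()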
(*) Among other possibilities, just use MyISAM.
How is that applicable to any discussion?
Domas
Hi,
It seems from my tinkering that MySQL query cache handling is circumvented via HandlerSocket. So if you update/insert/delete via HandlerSocket and then query via SQL, you're not guaranteed to see the changes unless you use SQL_NO_CACHE.
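Something like this is what I mean (connection and schema details are invented; the stale-read behaviour is just what I observed while tinkering):

import mysql.connector

conn = mysql.connector.connect(host="127.0.0.1", user="app",
                               password="secret", database="test")
cur = conn.cursor()

# After a write made through HandlerSocket, a plain SELECT may still be
# answered from the query cache and show the old row:
cur.execute("SELECT user_email FROM user WHERE user_id = %s", (42,))
print(cur.fetchone())

# SQL_NO_CACHE makes the server bypass the query cache and read the table:
cur.execute("SELECT SQL_NO_CACHE user_email FROM user WHERE user_id = %s", (42,))
print(cur.fetchone())

conn.close()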
Jared
Hi!
It seems from my tinkering that MySQL query cache handling is circumvented via HandlerSocket.
On busy systems (I assume we are talking about busy systems, as the discussion is about HS) the query cache is usually eliminated anyway - either by compiling it out, or by patching the code not to take the qcache mutexes unless it really, really is enabled. In the worst case, it is simply disabled. :)
So if you update/insert/delete via HandlerSocket and then query via SQL, you're not guaranteed to see the changes unless you use SQL_NO_CACHE.
You are probably right. Again, nobody cares about qcache at those performance boundaries.
Domas
Ah, interesting. The only reason I looked at it was that you don't have to faff with encoding/escaping values* the way you have to with SQL - SQL injection vulnerabilities just don't exist.
* And the protocol handles binary values, which you normally have to faff about with to get in and out of MySQL with the various PHP APIs.
It does seem a bit specialised, but it could back a persistent cache, maybe as a session handler.
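To illustrate the escaping point: the lookup key travels as its own field in the request, so there is no SQL string to inject into. The only encoding needed is for a few control bytes (the rule below is my reading of the protocol docs and may not be exact):

def hs_encode(value: bytes) -> bytes:
    # Bytes 0x00-0x0f (which include the \t and \n field/line separators)
    # get shifted behind a 0x01 marker; everything else passes through as-is.
    out = bytearray()
    for b in value:
        if b <= 0x0f:
            out += bytes([0x01, b | 0x40])
        else:
            out.append(b)
    return bytes(out)

user_input = b"alice'; DROP TABLE user;--"
request = b"0\t=\t1\t" + hs_encode(user_input) + b"\t1\t0\n"
# The request is only ever parsed as "find rows whose key equals this literal
# value", so the classic injection payload above is inert.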
Jared
Maybe a session handler even.
Jared
On Fri, Dec 24, 2010 at 4:08 AM, Domas Mituzas <midom.lists@gmail.com> wrote:
MySQL's upper layers know very little about transactions; it is all engine-specific - BEGIN and COMMIT processing is deferred to the table handlers. It would be incredibly easy for them to implement repeatable-read snapshots :) (if that's what you mean by an atomic read)
I suppose it's possible in theory, but in any case, it's not what they're doing. They *are* going through MySQL, via the HandlerSocket plugin.
I wonder if they'd get much different performance by just using prepared statements and read committed isolation, with the transactions spanning multiple requests. The tables would only get locked once per transaction, right?
Or do I just have no idea what I'm talking about?
(*) Among other possibilities, just use MyISAM.
How is that applicable to any discussion?
It was an example of a way to get fast results if you don't care about your reads being atomic.
Hi!
I suppose it's possible in theory, but in any case, it's not what they're doing. They *are* going through MySQL, via the HandlerSocket plugin.
After reading the code I can correct myself - they are calling into some MySQL internals (e.g. for opening tables), but everything else goes through the table handler interfaces :-) Though I guess what I mean by "going through MySQL" and what you have in mind are entirely different things :)
I wonder if they'd get much different performance by just using prepared statements and read committed isolation, with the transactions spanning multiple requests. The tables would only get locked once per transaction, right?
They wouldn't gain much performance with PS; transaction establishment is quite cheap in MySQL, especially compared to other vendors. Do note, prepared statements don't prepare the query plan for you - it is re-established at every execution - nor do they keep table handlers open, IIRC; generally you just don't have to reparse the text.
It was an example of a way to get fast results if you don't care about your reads being atomic.
InnoDB is faster than MyISAM under high-performance workloads.
Domas
Hi!
I have recently encountered this text in which the author claims very high MySQL speedups for simple queries
It is not that he speeds up simple queries (you'd maybe notice that if you used InfiniBand, and even then it wouldn't matter much :) He just avoided hitting some expensive critical sections that make scaling on multicore systems problematic.
It looks interesting. There are some places where MediaWiki could take that shortcut if it were available.
It wouldn't be a shortcut if you had to establish another database connection besides the existing one.
I wonder if we have such a CPU bottleneck, though.
No, not really. Our average DB response time (do note, this isn't the median, and it is affected more by heavy queries) is 1.3 ms, measured on the client.
Domas
On 12/24/2010 10:01 AM, Domas Mituzas wrote:
I wonder if we have such a CPU bottleneck, though.
No, not really. Our average DB response time (do note, this isn't the median, and it is affected more by heavy queries) is 1.3 ms, measured on the client.
This could also reduce memory usage by using memcached less often, which, I understand, is a bigger problem.
Hi!
This could also reduce memory usage by using memcached less often, which, I understand, is a bigger problem.
No it is not.
First of all, our memcached and database access times are not that far apart - 0.7 vs 1.3 ms (again, the memcached figure is a static response time, whereas the database average is impacted by calculations). On the other hand, we don't store in memcached what is stored in the database, and we don't store in the database what is stored in memcached.
Think of these as two separate systems, not as complementing each other too much. We use memcached to offload the application cluster, not the database cluster.
And the database cluster already has over a terabyte of RAM (replicas and whatnot), whereas our memcached lives in a puny 158 GB arena.
I described some of the fundamental differences in how we use memcached in http://dom.as/uc/workbook2007.pdf, pages 11-13. Not much has changed since then.
Domas