one machine to rule the all

List overview All Threads
Download

newer

older

Fwd: Data to improve our code...

Wikimedia Analytics @ Wikimania...

Sean Pringle

3 Apr 2014 3 Apr '14

3:24 a.m.

Hi!

In another thread Oliver asked about the progress of One Machine To Rule Them All :-)

In fact it looks like it will now be two machines to rule them all, or rather, two machines to cooperatively rule them all in roughly equal capacity. I know, it doesn't have the same ring to it...

I posted the following to RT 6383, but who knows who reads RT, so here it is again:

-- quote -- An update on this. Some Analytics folk have probably already heard bits and pieces via mailing lists, but my fellow Opsen on RT duty rightly begin to wonder about this ticket.

We have procured dbstore100[12], both with which will be replicating shards S[1-7] into a single MariaDB 10 instance each, using the new multi-source replication. The boxes are still being setup (because recombining the shards requires full dump/reload, plus getting all seven in sync, plus compressing tables -- slow going). The x1 shard and event logging will replicate to dbstore too, but that's pending RT 7081. Analytics will have direct, but read-only, access to dbstore1002.

db1047, the current s1-analytics-slave, has the required disk space so it will likely become a slave to dbstore1002, or else make use of the MariaDB 10 CONNECT engine to access the data (like federation, but better than FEDERATED engine was, thanks to ECP: engine-condition-pushdown). As ever, Analytics will have read/write access to db1047 with scratch space.

The situation will result in:

- cross joins/unions on any wiki on either db1047 or dbstore1002 - ability to spread load across both boxes with a single SQL query - less likely to block others due to locking - less likely to cause replag

I'm happy to go into more technical detail if anyone is interested.

When will it be ready, you ask? :-) Not until after the Ops meet in Athens, which realistically means: in May. -- endquote --

The bit about spreading load across two machines with one query will require people to be a bit careful in designing the SQL. Alternatively you guys might simply choose to dictate which box should run expensive queries, to avoid tripping each other up.

Incidentally, MariaDB 10 has the Cassandra storage engine which might be of some use to you guys in time. But so far I've only been trialing CONNECT+ECP.

BR Sean

-- DBA @ WMF

Attachments:

attachment.htm (text/html — 2.6 KB)

Show replies by date

Oliver Keyes

3 Apr 3 Apr

9:21 a.m.

Awesome; thanks for such a detailed update! Can you give details on the query-over-multiple machines risk and what could cause it? I have no wish to respond to people going out of their way to prevent us breaking things by, ah, breaking things ;).

I knew there was a reason I was sending a bottle of single malt to Athens.

On 2 April 2014 18:24, Sean Pringle springle@wikimedia.org wrote:

...

Hi!

In another thread Oliver asked about the progress of One Machine To Rule Them All :-)

In fact it looks like it will now be two machines to rule them all, or rather, two machines to cooperatively rule them all in roughly equal capacity. I know, it doesn't have the same ring to it...

I posted the following to RT 6383, but who knows who reads RT, so here it is again:

-- quote -- An update on this. Some Analytics folk have probably already heard bits and pieces via mailing lists, but my fellow Opsen on RT duty rightly begin to wonder about this ticket.

We have procured dbstore100[12], both with which will be replicating shards S[1-7] into a single MariaDB 10 instance each, using the new multi-source replication. The boxes are still being setup (because recombining the shards requires full dump/reload, plus getting all seven in sync, plus compressing tables -- slow going). The x1 shard and event logging will replicate to dbstore too, but that's pending RT 7081. Analytics will have direct, but read-only, access to dbstore1002.

db1047, the current s1-analytics-slave, has the required disk space so it will likely become a slave to dbstore1002, or else make use of the MariaDB 10 CONNECT engine to access the data (like federation, but better than FEDERATED engine was, thanks to ECP: engine-condition-pushdown). As ever, Analytics will have read/write access to db1047 with scratch space.

The situation will result in:

cross joins/unions on any wiki on either db1047 or dbstore1002

ability to spread load across both boxes with a single SQL query

less likely to block others due to locking

less likely to cause replag

I'm happy to go into more technical detail if anyone is interested.

When will it be ready, you ask? :-) Not until after the Ops meet in Athens, which realistically means: in May. -- endquote --

The bit about spreading load across two machines with one query will require people to be a bit careful in designing the SQL. Alternatively you guys might simply choose to dictate which box should run expensive queries, to avoid tripping each other up.

Incidentally, MariaDB 10 has the Cassandra storage engine which might be of some use to you guys in time. But so far I've only been trialing CONNECT+ECP.

BR Sean

-- DBA @ WMF

Analytics mailing list Analytics@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/analytics

-- Oliver Keyes Research Analyst Wikimedia Foundation

Aaron Halfaker

3:49 p.m.

Thanks for the update Sean. This is exciting!

Am I understanding correctly that the two machines will be practically identical, but that db1047 will contain writeable DBs and the new machine will not? Is the purpose of this to have higher querying capacity than write capacity -- to match the hardware capabilities with our usage pattern?

-Aaron

On Thu, Apr 3, 2014 at 2:21 AM, Oliver Keyes okeyes@wikimedia.org wrote:

...

Awesome; thanks for such a detailed update! Can you give details on the query-over-multiple machines risk and what could cause it? I have no wish to respond to people going out of their way to prevent us breaking things by, ah, breaking things ;).

I knew there was a reason I was sending a bottle of single malt to Athens.

On 2 April 2014 18:24, Sean Pringle springle@wikimedia.org wrote:

...
Hi!

In another thread Oliver asked about the progress of One Machine To Rule Them All :-)

In fact it looks like it will now be two machines to rule them all, or rather, two machines to cooperatively rule them all in roughly equal capacity. I know, it doesn't have the same ring to it...

I posted the following to RT 6383, but who knows who reads RT, so here it is again:

-- quote -- An update on this. Some Analytics folk have probably already heard bits and pieces via mailing lists, but my fellow Opsen on RT duty rightly begin to wonder about this ticket.

We have procured dbstore100[12], both with which will be replicating shards S[1-7] into a single MariaDB 10 instance each, using the new multi-source replication. The boxes are still being setup (because recombining the shards requires full dump/reload, plus getting all seven in sync, plus compressing tables -- slow going). The x1 shard and event logging will replicate to dbstore too, but that's pending RT 7081. Analytics will have direct, but read-only, access to dbstore1002.

db1047, the current s1-analytics-slave, has the required disk space so it will likely become a slave to dbstore1002, or else make use of the MariaDB 10 CONNECT engine to access the data (like federation, but better than FEDERATED engine was, thanks to ECP: engine-condition-pushdown). As ever, Analytics will have read/write access to db1047 with scratch space.

The situation will result in:

cross joins/unions on any wiki on either db1047 or dbstore1002

ability to spread load across both boxes with a single SQL query

less likely to block others due to locking

less likely to cause replag

I'm happy to go into more technical detail if anyone is interested.

When will it be ready, you ask? :-) Not until after the Ops meet in Athens, which realistically means: in May. -- endquote --

The bit about spreading load across two machines with one query will require people to be a bit careful in designing the SQL. Alternatively you guys might simply choose to dictate which box should run expensive queries, to avoid tripping each other up.

Incidentally, MariaDB 10 has the Cassandra storage engine which might be of some use to you guys in time. But so far I've only been trialing CONNECT+ECP.

BR Sean

-- DBA @ WMF

Analytics mailing list Analytics@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/analytics

-- Oliver Keyes Research Analyst Wikimedia Foundation

Analytics mailing list Analytics@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/analytics

Sean Pringle

4 Apr 4 Apr

8:10 a.m.

On Thu, Apr 3, 2014 at 11:49 PM, Aaron Halfaker ahalfaker@wikimedia.orgwrote:

...

Thanks for the update Sean. This is exciting!

Am I understanding correctly that the two machines will be practically identical, but that db1047 will contain writeable DBs and the new machine will not? Is the purpose of this to have higher querying capacity than write capacity -- to match the hardware capabilities with our usage pattern?

Roughly correct interpretation, yes; but there are other differences, in hardware for example:

db1047: Dell R510, Xeon E5-640, 64g memory. dbstore1002: Dell R720, Xeon E5-2650, 128g memory.

The MariaDB setup will also differ between to two:

db1047 will be configured to support heavy and ad-hoc write activity in your scratch spaces, since I can't predict what you'll do there, and have normal read-only replicated wikis. The box will be solely for Analytics to hammer and co-exist upon :-)

dbstore1002 will be configured more like a data warehouse (of one! for now), optimized for compression on disk and in memory, high-read/low-write activity, few concurrent threads with large buffers, and minimal locking. It will, at times, be shared with other slow, read-heavy queries, such as the xml dump processors.

Custom indexes will be possible for either box (I'd suggest keeping them documented and in sync). Same for custom aggregated tables, views, or star/snowflake type stuff on the side.

Sean

Sean Pringle

8:37 a.m.

On Thu, Apr 3, 2014 at 5:21 PM, Oliver Keyes okeyes@wikimedia.org wrote:

...

Awesome; thanks for such a detailed update! Can you give details on the query-over-multiple machines risk and what could cause it? I have no wish to respond to people going out of their way to prevent us breaking things by, ah, breaking things ;).

Firstly, you'll only really risk breaking your own things; all this should be fairly insulated from the rest of the cluster. It also shouldn't require much extra thought when writing queries, except to use specific databases or tables if asked to do so :-)

I mentioned the MariaDB 10 CONNECT engine uses Engine Condition Pushdown. That's something that was developed some years ago when NDB, the MySQL Cluster engine, was being integrated, done to improve performance on the network.

The conditions in a query -- the WHERE clauses -- are sent to the storage engine by the SQL optimizer before query execution starts. If the engine supports ECP it can often pre-filter data before returning it to the SQL layer, resulting in less overhead, less data moving around, smaller temp-tables, faster sorts, etc. It's particularly useful when indexes are not necessarily optimal for ad-hoc queries.

If db1047 has a CONNECT table referencing a remote table on dbstore1002, then some or all of the conditions filtering that table will be executed on dbstore1002, while other tables and joins occur on db1047. It means shipping some data over the network, but hopefully not too much, and usually no more than any sharded/clustered/nosql solution.

CONNECT won't be a magic bullet, but it's an option that doesn't require significant redesign on any layer, and was designed for OLAP work.

Sean

3929

Age (days ago)

3930

Last active (days ago)

analytics@lists.wikimedia.org

4 comments

3 participants

tags (0)

participants (3)

Aaron Halfaker
Oliver Keyes
Sean Pringle