On thistle with DB=dewiki:
mysql> explain select * from recentchanges left join tag_summary on ts_rc_id=rc_id order by rc_timestamp desc limit 50\G
*************************** 1. row ***************************
        table: recentchanges
         type: ALL
possible_keys: NULL
          key: NULL
      key_len: NULL
          ref: NULL
         rows: 1179921
        Extra: Using temporary; Using filesort
*************************** 2. row ***************************
        table: tag_summary
         type: ALL
possible_keys: ts_rc_id
          key: NULL
      key_len: NULL
          ref: NULL
         rows: 4
        Extra:
2 rows in set (0.00 sec)
Whenever you do a join with a limit, MySQL gets the query plan wrong. It scans the small table and filesorts the large table. You have to use FORCE INDEX on the small table to suppress the scan. We've seen this many times. It's very difficult to detect during code review and frequently crashes the site.
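Roughly, that workaround looks like this (just a sketch; the index name is taken from the EXPLAIN output above, and which table actually needs the hint depends on the plan MySQL is getting wrong):

select * from recentchanges
  left join tag_summary force index (ts_rc_id) on ts_rc_id=rc_id
  order by rc_timestamp desc limit 50;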
Does anyone know a DBMS where joining with limits actually works? Because I'm sick of this crap.
-- Tim Starling
Tim Starling wrote:
Whenever you do a join with a limit, MySQL gets the query plan wrong. It scans the small table and filesorts the large table. You have to use FORCE INDEX on the small table to suppress the scan. We've seen this many times. It's very difficult to detect during code review and frequently crashes the site.
Does anyone know a DBMS where joining with limits actually works? Because I'm sick of this crap.
Would it help to do it in a way similar to what you have to do in DBMSes without limit:
select a.* from ( select * from recentchanges left join tag_summary on ts_rc_id=rc_id order by rc_timestamp desc ) a limit 50
or is MySQL's parser too smart for its own good?
Hi!
Whenever you do a join with a limit, MySQL gets the query plan wrong.
"Wherever you do a unconstrained join with a limit, MySQL may get the query plan wrong". Anyway, this exact case looks like a bug that was fixed later.
It scans the small table and filesorts the large table.
It actually makes sense when the small table refers to a subset of the large table. I've discussed this with the optimizer team quite a few times; it is one of the gray areas where having better statistics within the engine helps.
Does anyone know a DBMS where joining with limits actually works? Because I'm sick of this crap.
Every database in the end will want you to provide hints in one way or another. Indexing itself is already a hint :)
Anyway, as you may note, we have been up for quite a while; it's just that developers end up thinking "oh, the database is too stupid to do X or to do Y".
Every static decision can have its own problems, and this is where you end up having humans do the work. We don't employ software to write all the code for us, do we? In a similar way, maybe we can't always be sure that the software which does all the data management for us (how many developers actually think about indexing and data fetch efficiency?) won't sometimes make the wrong decision.
(And yes, maybe it is fixed in later versions, along with other problems fixed, and other problems introduced).
Does anyone know a DBMS where joining with limits actually works?
*shrug*, maybe PG does it properly, maybe MySQL 5.x does it properly, maybe sqlite does it properly.
Because I'm sick of this crap.
Well, FORCE INDEX used to be where it was for a reason, whatever the reason was. Removing it was a problem too ;-)
Anyway, one can be sick of lots of crap; I'm sick of the fact that 1% of our requests that get to the backend (which is 0.1% of the requests made to the site) account for 50% of the site's CPU load. The simple fact is that we had forced indexes where needed, but the site still has 0.1% of user requests causing 50% of the CPU load.
Cheers,
Domas Mituzas midom.lists@gmail.com wrote:
[...]
Does anyone know a DBMS where joining with limits actually works?
*shrug*, maybe PG does it properly, maybe MySQL 5.x does it properly, maybe sqlite does it properly. [...]
BTW, is it feasible to replicate from MySQL to PostgreSQL (or SQLite :-)) in a continuous way so that one or two cuckoo's eggs could be planted in the database farm? Though probably undesirable from an operations viewpoint, a real-time performance benchmark would be really cool :-).
Tim
Tim Landscheidt wrote:
BTW, is it feasible to replicate from MySQL to PostgreSQL (or SQLite :-)) in a continuous way so that one or two cuckoo's eggs could be planted in the database farm? Though probably undesirable from an operations viewpoint, a real-time performance benchmark would be really cool :-).
Tim
The SQL syntax varies between DBs, so I don't think it's feasible. What _could_ be done would be making MW update several DBs at the same time. I'm sure DB inconsistencies will appear, though.
Hi,
BTW, is it feasible to replicate from MySQL to PostgreSQL (or SQLite :-)) in a continuous way so that one or two cuckoo's eggs could be planted in the database farm?
In theory, MySQL 5.1's replication can send only row changes (row-based binary logging), so that can be used to replicate to different stores.
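Concretely, that would mean something like the following master-side settings (just a sketch, assuming a MySQL 5.1 master; the consumer that decodes the binlog into PostgreSQL/SQLite would still have to be written):

[mysqld]
log-bin       = mysql-bin   # write a binary log that replication consumers can read
binlog_format = ROW         # log row changes instead of SQL statements
server-id     = 1           # any server that writes a binlog needs an id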
Though probably undesirable from an operations viewpoint, a real-time performance benchmark would be really cool :-).
Yes and yes :)
Does anyone know a DBMS where joining with limits actually works? Because I'm sick of this crap.
FWIW, this is what Postgres gives:
mw=# explain select * from recentchanges left join tag_summary on ts_rc_id=rc_id order by rc_timestamp desc limit 50;
                              QUERY PLAN
------------------------------------------------------------
 Limit  (cost=0.00..31.28 rows=50)
   ->  Nested Loop Left Join  (cost=0.00..475208.34 rows=2940914)
         ->  Index Scan Backward using rc_timestamp on recentchanges  (cost=0.00..45650.46 rows=1020612)
         ->  Index Scan using tag_summary_rc_id on tag_summary  (cost=0.00..0.37 rows=4)
               Index Cond: (tag_summary.ts_rc_id = recentchanges.rc_id)
EXPLAIN ANALYZE with about a million (bogus) rows in each table:
mw=# explain analyze select * from recentchanges left join tag_summary on ts_rc_id=rc_id order by rc_timestamp desc limit 50;
                              QUERY PLAN
-------------------------------------------------------------------------
 Limit  (cost=0.00..31.28 rows=50) (actual time=0.147..0.415 rows=50 loops=1)
   ->  Nested Loop Left Join  (cost=0.00..475208.34 rows=2940914) (actual time=0.146..0.372 rows=50 loops=1)
         ->  Index Scan Backward using rc_timestamp on recentchanges  (cost=0.00..45650.46 rows=1020612) (actual time=0.110..0.122 rows=4 loops=1)
         ->  Index Scan using tag_summary_rc_id on tag_summary  (cost=0.00..0.47 rows=4) (actual time=0.014..0.043 rows=12 loops=4)
               Index Cond: (tag_summary.ts_rc_id = recentchanges.rc_id)
 Total runtime: 0.559 ms
On Fri, Feb 27, 2009 at 2:29 AM, Tim Starling tstarling@wikimedia.org wrote:
On thistle with DB=dewiki:
mysql> explain select * from recentchanges left join tag_summary on ts_rc_id=rc_id order by rc_timestamp desc limit 50\G
*************************** 1. row ***************************
        table: recentchanges
         type: ALL
possible_keys: NULL
          key: NULL
      key_len: NULL
          ref: NULL
         rows: 1179921
        Extra: Using temporary; Using filesort
*************************** 2. row ***************************
        table: tag_summary
         type: ALL
possible_keys: ts_rc_id
          key: NULL
      key_len: NULL
          ref: NULL
         rows: 4
        Extra:
2 rows in set (0.00 sec)
Almost two months later, but now that I have access to the toolserver DB, here's some output from zedler on dewiki:
mysql> SELECT VERSION();
+-----------+
| VERSION() |
+-----------+
| 5.1.33    |
+-----------+
1 row in set (0.00 sec)
mysql> explain select * from recentchanges left join tag_summary on ts_rc_id=rc_id order by rc_timestamp desc limit 50\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: recentchanges
         type: index
possible_keys: NULL
          key: rc_timestamp
      key_len: 16
          ref: NULL
         rows: 50
        Extra:
*************************** 2. row ***************************
           id: 1
  select_type: SIMPLE
        table: tag_summary
         type: ref
possible_keys: ts_rc_id
          key: ts_rc_id
      key_len: 5
          ref: dewiki.recentchanges.rc_id
         rows: 1
        Extra:
2 rows in set (0.00 sec)
So "MySQL 5.1" is a valid answer. (EXPLAIN even understands LIMIT now, yay!)