It's the sorts on some of our queries that are so damn slow. A few hundred or a couple thousand rows can take *a couple minutes* for it to sort on the timestamp field.
I've just put in a hacky manual sort (in CVS) for the page history; all rows are returned without sorting (so in ID order, usually but not always chronological) and then output in reverse order. It's *immensely* faster. Try loading up the history on the Village Pump or Votes for Deletion, and watch it actually come up! Kinda nice. :)
I also took the opportunity to drop in the see next X links, so you don't have to load all 2500+ items at once on your (well, my) 56k modem. (These are used pretty generally, and the code really ought to be moved out of SearchEngine.php and into globalfunctions...)
Of course, it would be *nicer* to do the ORDER BY and LIMIT in mysql.
The old query: EXPLAIN SELECT old_id,old_namespace,old_title,old_user, -> old_comment,old_user_text,old_timestamp,old_minor_edit FROM old -> WHERE old_namespace=4 AND -> old_title='Village_pump' -> ORDER BY old_timestamp DESC; +-------+------+---------------+---------------+---------+-------------+------+----------------------------+ | table | type | possible_keys | key | key_len | ref | rows | Extra | +-------+------+---------------+---------------+---------+-------------+------+----------------------------+ | old | ref | old_namespace | old_namespace | 256 | const,const | 1900 | where used; Using filesort | +-------+------+---------------+---------------+---------+-------------+------+----------------------------+
The new query: EXPLAIN SELECT old_id,old_namespace,old_title,old_user, -> old_comment,old_user_text,old_timestamp,old_minor_edit FROM old -> WHERE old_namespace=4 AND -> old_title='Village_pump'; +-------+------+---------------+---------------+---------+-------------+------+------------+ | table | type | possible_keys | key | key_len | ref | rows | Extra | +-------+------+---------------+---------------+---------+-------------+------+------------+ | old | ref | old_namespace | old_namespace | 256 | const,const | 1900 | where used | +-------+------+---------------+---------------+---------+-------------+------+------------+
Now, in an ideal world, it could do the sorting based on that handy index it has on old_timestamp. In the topsy-turvy world of MySQL, however, only the index used for the WHERE clause means anything. Other ways to speed it up?
http://www.mysql.com/doc/en/ORDER_BY_optimisation.html
Oh, yes, and I changed the index from old_title to a combo on old_namespace and old_title. This should cut down the number of rows a little bit where a page and its talk page are both long; we never WHERE the two columns separately on the old table, only together.
SHOW INDEX FROM old; +-------+------------+---------------+--------------+---------------+-----------+-------------+----------+--------+---------+ | Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Comment | +-------+------------+---------------+--------------+---------------+-----------+-------------+----------+--------+---------+ | old | 0 | old_id | 1 | old_id | A | 744949 | NULL | NULL | | | old | 1 | old_timestamp | 1 | old_timestamp | A | 744949 | NULL | NULL | | | old | 1 | old_user | 1 | old_user | A | 14 | NULL | NULL | | | old | 1 | old_user_text | 1 | old_user_text | A | 74494 | NULL | NULL | | | old | 1 | old_namespace | 1 | old_namespace | A | 14 | NULL | NULL | | | old | 1 | old_namespace | 2 | old_title | A | 148989 | NULL | NULL | | +-------+------------+---------------+--------------+---------------+-----------+-------------+----------+--------+---------+
And, since I was torturing things with slowness anyway, old is now InnoDB. Whoop.
-- brion vibber (brion @ pobox.com)