On Sun, Feb 02, 2003 at 11:49:05PM -0800, Brion Vibber wrote:
I've just put in a hacky manual sort (in CVS) for the page history; all
rows are returned without sorting (so in ID order, usually but not
always chronological) and then output in reverse order. It's *immensely*
faster. Try loading up the history on the Village Pump or Votes for
Deletion, and watch it actually come up! Kinda nice. :)
I also took the opportunity to drop in the see next X links, so you
don't have to load all 2500+ items at once on your (well, my) 56k modem.
(These are used pretty generally, and the code really ought to be moved
out of SearchEngine.php and into globalfunctions...)
Of course, it would be *nicer* to do the ORDER BY and LIMIT in mysql.
The old query:
EXPLAIN SELECT old_id,old_namespace,old_title,old_user,
-> old_comment,old_user_text,old_timestamp,old_minor_edit FROM old
-> WHERE old_namespace=4 AND
-> old_title='Village_pump'
-> ORDER BY old_timestamp DESC;
+-------+------+---------------+---------------+---------+-------------+------+----------------------------+
| table | type | possible_keys | key | key_len | ref | rows | Extra
|
+-------+------+---------------+---------------+---------+-------------+------+----------------------------+
| old | ref | old_namespace | old_namespace | 256 | const,const | 1900 | where
used; Using filesort |
+-------+------+---------------+---------------+---------+-------------+------+----------------------------+
The new query:
EXPLAIN SELECT old_id,old_namespace,old_title,old_user,
-> old_comment,old_user_text,old_timestamp,old_minor_edit FROM old
-> WHERE old_namespace=4 AND
-> old_title='Village_pump';
+-------+------+---------------+---------------+---------+-------------+------+------------+
| table | type | possible_keys | key | key_len | ref | rows | Extra
|
+-------+------+---------------+---------------+---------+-------------+------+------------+
| old | ref | old_namespace | old_namespace | 256 | const,const | 1900 | where
used |
+-------+------+---------------+---------------+---------+-------------+------+------------+
Now, in an ideal world, it could do the sorting based on that handy
index it has on old_timestamp. In the topsy-turvy world of MySQL,
however, only the index used for the WHERE clause means anything. Other
ways to speed it up?
Actually, that is not MySQL's fault but just a consequence of how indices
work; the only way in which an index can help with sorting is that it in
some sense presorts the records for you, but that is irrelevant if you also
use other indices (as you do here) because they will order the records in a
different way.
Anyway, there is actually a "trick" that you can use here to get always the
chronological order and that is to define an in index on (old_namespace,
old_title, old_timestamp) and forcing MySQL to use that index as in
(assuming that this indes is named old_namespace_2):
SELECT old_id, old_timestamp
FROM old
USE INDEX (old_namespace_2)
WHERE old_namespace=0 AND old_title='Village_pump'
ORDER BY old_timestamp;
then EXPLAIN will give you:
+-------+------+-------------------------------+-----------------+---------+-------------+------+------------+
| table | type | possible_keys | key | key_len | ref |
rows | Extra |
+-------+------+-------------------------------+-----------------+---------+-------------+------+------------+
| old | ref | old_namespace,old_namespace_2 | old_namespace_2 | 256 | const,const |
29 | where used |
+-------+------+-------------------------------+-----------------+---------+-------------+------+------------+
As you can see there is no sorting and that is because it uses the fact that
the index already presorts the records for you. Because MySQL knows that
this index can also be used as an index on (namespace, title) and even on
(namespace) you can drop the indices on those combinations. I'm not sure if
you can actually omit the USE but I know that if the other index is still
around on (namespace, title) it will usually pick that index, and then it
still has to sort.
You can use this method in most cases where we sort on timestamp, and you
might even consider storing the timestamps negated (unfortunately MySQL
doesn't have descending indices) in an extra redundant column and use that
column for the index. Then you wouldn't even have to reverse the list and
could efficiently do a query that selects that last 100 or so without
reading the full list into memory.
-- Jan Hidders