On Sun, Feb 02, 2003 at 11:49:05PM -0800, Brion Vibber wrote:
I've just put in a hacky manual sort (in CVS) for the page history; all rows are returned without sorting (so in ID order, usually but not always chronological) and then output in reverse order. It's *immensely* faster. Try loading up the history on the Village Pump or Votes for Deletion, and watch it actually come up! Kinda nice. :)
I also took the opportunity to drop in the see next X links, so you don't have to load all 2500+ items at once on your (well, my) 56k modem. (These are used pretty generally, and the code really ought to be moved out of SearchEngine.php and into globalfunctions...)
Of course, it would be *nicer* to do the ORDER BY and LIMIT in mysql.
The old query: EXPLAIN SELECT old_id,old_namespace,old_title,old_user, -> old_comment,old_user_text,old_timestamp,old_minor_edit FROM old -> WHERE old_namespace=4 AND -> old_title='Village_pump' -> ORDER BY old_timestamp DESC; +-------+------+---------------+---------------+---------+-------------+------+----------------------------+ | table | type | possible_keys | key | key_len | ref | rows | Extra | +-------+------+---------------+---------------+---------+-------------+------+----------------------------+ | old | ref | old_namespace | old_namespace | 256 | const,const | 1900 | where used; Using filesort | +-------+------+---------------+---------------+---------+-------------+------+----------------------------+
The new query: EXPLAIN SELECT old_id,old_namespace,old_title,old_user, -> old_comment,old_user_text,old_timestamp,old_minor_edit FROM old -> WHERE old_namespace=4 AND -> old_title='Village_pump'; +-------+------+---------------+---------------+---------+-------------+------+------------+ | table | type | possible_keys | key | key_len | ref | rows | Extra | +-------+------+---------------+---------------+---------+-------------+------+------------+ | old | ref | old_namespace | old_namespace | 256 | const,const | 1900 | where used | +-------+------+---------------+---------------+---------+-------------+------+------------+
Now, in an ideal world, it could do the sorting based on that handy index it has on old_timestamp. In the topsy-turvy world of MySQL, however, only the index used for the WHERE clause means anything. Other ways to speed it up?
Actually, that is not MySQL's fault but just a consequence of how indices work; the only way in which an index can help with sorting is that it in some sense presorts the records for you, but that is irrelevant if you also use other indices (as you do here) because they will order the records in a different way.
Anyway, there is actually a "trick" that you can use here to get always the chronological order and that is to define an in index on (old_namespace, old_title, old_timestamp) and forcing MySQL to use that index as in (assuming that this indes is named old_namespace_2):
SELECT old_id, old_timestamp FROM old USE INDEX (old_namespace_2) WHERE old_namespace=0 AND old_title='Village_pump' ORDER BY old_timestamp;
then EXPLAIN will give you:
+-------+------+-------------------------------+-----------------+---------+-------------+------+------------+ | table | type | possible_keys | key | key_len | ref | rows | Extra | +-------+------+-------------------------------+-----------------+---------+-------------+------+------------+ | old | ref | old_namespace,old_namespace_2 | old_namespace_2 | 256 | const,const | 29 | where used | +-------+------+-------------------------------+-----------------+---------+-------------+------+------------+
As you can see there is no sorting and that is because it uses the fact that the index already presorts the records for you. Because MySQL knows that this index can also be used as an index on (namespace, title) and even on (namespace) you can drop the indices on those combinations. I'm not sure if you can actually omit the USE but I know that if the other index is still around on (namespace, title) it will usually pick that index, and then it still has to sort.
You can use this method in most cases where we sort on timestamp, and you might even consider storing the timestamps negated (unfortunately MySQL doesn't have descending indices) in an extra redundant column and use that column for the index. Then you wouldn't even have to reverse the list and could efficiently do a query that selects that last 100 or so without reading the full list into memory.
-- Jan Hidders