[Labs-l] Database Indexing Doesn't seem to be working
Maximilian Doerr
maximilian.doerr at gmail.com
Thu Nov 13 19:04:43 UTC 2014
Thanks for the clarification, but what I don’t understand is why it’s so slow now when it used take a few seconds at most to fetch the data for this query.
Cyberpower678
English Wikipedia Account Creation Team
Mailing List Moderator
> On Nov 13, 2014, at 14:00, Brad Jorsch (Anomie) <bjorsch at wikimedia.org> wrote:
>
> On Wed, Nov 12, 2014 at 11:06 AM, Maximilian Doerr <maximilian.doerr at gmail.com <mailto:maximilian.doerr at gmail.com>> wrote:
> My knowledge of MySQL is pathetic to be honest. It’s definitely something I should familiarize myself with. Could you explain that to me so a pickle could understand it? :/
>
> Unfortunately, I don't speak pickle. But I'll try to clarify.
>
>
> "enwiki_p.revision_userindex" is a view (along with everything else in enwiki_p, I believe), which is basically a pretend table that's really populated by a SELECT query (in this case, from "enwiki.revision"). You can use "SHOW CREATE TABLE enwiki_p.revision_userindex;" to see that query if you want. So when you select from the view, it basically rewrites your query to incorporate that other select.
>> +------+-------------+----------+--------+---------------------------------------+----------------+---------+--------------------------+-------+-----------------------------+
>> | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
>> +------+-------------+----------+--------+---------------------------------------+----------------+---------+--------------------------+-------+-----------------------------+
>> | 1 | SIMPLE | revision | ref | PRIMARY,page_timestamp,user_timestamp | user_timestamp | 4 | const | 29186 | Using where; Using filesort |
>> | 1 | SIMPLE | page | eq_ref | PRIMARY | PRIMARY | 4 | enwiki.revision.rev_page | 1 | |
> The "key" column here tells us that it is in fact using the user_timestamp index to find the rows in the revision table. Which is good, indexing isn't broken.
>
> The "Extra" column tells us a few things.
> It doesn't have "Using index", so it's having to actually fetch the row from the table instead of just getting what it needs from the index (in this case, that's because the view filters based on rev_deleted). That makes things slightly slower, especially if you're having to hit cold pages (i.e. blocks from the disk that aren't already cached in memory).
> "Using where" tells us that it might to throw away some of the rows that it fetched, thanks to a WHERE clause (again, the rev_deleted check).
> "Using filesort" means that it's going to load all the rows into memory (or worse, write them to a temp file) and then sort them to put them in ORDER BY order. This can get really slow if there are a lot of them. I don't know why it's deciding it needs to do this here, since the user_timestamp index should already be giving the rows in the asked-for order.
> And then, of course, it's having to fetch rows from the page table as well, which could again be hitting cold pages.
>
>
> --
> Brad Jorsch (Anomie)
> Software Engineer
> Wikimedia Foundation
> _______________________________________________
> Labs-l mailing list
> Labs-l at lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/labs-l
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://lists.wikimedia.org/pipermail/labs-l/attachments/20141113/6d250c4f/attachment.html>
More information about the Labs-l
mailing list