[Labs-l] Query optimization

John phoenixoverride at gmail.com
Thu Mar 10 11:48:12 UTC 2016


Take a look at:

 select * from category left join page on page_title = cat_title where
page_namespace = 14 and cat_pages > 500 and 'fa' NOT IN (SELECT ll_lang
FROM langlinks where ll_from = page_id)

It works very quick and gives what you need.


On Thu, Mar 10, 2016 at 4:09 AM, Jaime Crespo <jcrespo at wikimedia.org> wrote:

> On Thu, Mar 10, 2016 at 9:24 AM, Purodha Blissenbach
> <purodha at blissenbach.org> wrote:
> > I do not know how replication works, but I did and do assume, it
> > replicates indexes. If so, the online manual starting at
> > https://www.mediawiki.org/wiki/Manual:Database_layout
> > may help you.
>
> Correct. There are some cases where the indexing doesn't work (it
> doesn't make sense to use certain indexes that are partially filtered
> on the view). In those cases, new views are created, such as
> "revision_userindex", etc.
>
> Maybe we can add a "columns" table to information_schema_p, or the
> create tables to the "tables" table, where the original table
> description is explained. File a ticket if that would be useful.
>
> If there are several users that perform common slow analytics, file a
> ticket and we could, if technically possible, do them in advance so
> they do not have to be done every single time. Another option is to
> replicate to a column-based engine, but I am not sure if people would
> know how to take most advantage of those (where aggregations are
> cheap).
>
> > Shouldn't a simple join on indexed fields + a simple aggregation on an
> indexed field be relatively cheap
>
> No, aggregation on mysql, on standard engines, is never cheap. Indexes
> do not help with that, they still need to count every single row (so
> you are creating a filter, of a sort, of a full tablescan of one of
> the largest tables per wiki). Why doing that when the count is already
> aggregated?
>
> I recently delivered a session on query optimization for mysql and
> mariadb, using wikimedia and osm as examples [warning, external site]
> <https://github.com/jynus/query-optimization>, maybe that would be
> helpful?
>
> --
> Jaime Crespo
> <http://wikimedia.org>
>
> _______________________________________________
> Labs-l mailing list
> Labs-l at lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/labs-l
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://lists.wikimedia.org/pipermail/labs-l/attachments/20160310/706a04fb/attachment.html>


More information about the Labs-l mailing list