It looks to me like the query attempts to count all logged-in non-bot page creations in certain namespaces, grouped by user, and each user also has their counts for the various different individual namespaces shown.

On Wed, 5 Jun 2019 at 09:53, Jaime Crespo <> wrote:
> Finally, any advice on how to make it more efficient is highly appreciated.

I would be more concerned about the derived subquery, as that can
create many executions of the inside query for every row of the
outside one.

If you could update the query with a comprehensive comment of what you
are trying to achieve, that would help optimizing it, because some
times transformation requires a logically different query but one that
obtains equal or similar results.

On Wed, Jun 5, 2019 at 2:21 AM Bryan Davis <> wrote:
> On Tue, Jun 4, 2019 at 4:54 PM Huji Lee <> wrote:
> >
> > Hi all,
> >
> > I have a query that used to do a GROUP BY on rev_user and now I have migrated it to do so using rev_actor instead. The latest version looks like this [0] and its performance is not good [1] at all. In fact, I left it running for fawiki for several hours and it did not generate any results (it used to generate results in about 30-40 minutes). Also, the execution plan that is shown in the second link involves some tables like filearchive, image and oldimage that are not part of the query.
> >
> > Is this a side effect of the revision_actor_temp table?
> The EXPLAIN output including filearchive, image, oldimage, and other
> apparently unrelated fields is caused by the view for the actor table.
> This is basically the same problem as I described in
> <>
> for the view of the comment table. The flag for suppression lives in
> the related entity table, so the actor view runs correlated
> sub-queries for each entity table that could indicate that a specific
> row from the actor table has been suppressed.
> I created <> to look for a
> solution for this same class of issue in the comment table. If we find
> a fix there it seems likely that we could do the same thing for the
> actor table, whatever that may end up being.
> > Also, what is that "key0" about, which is not associated with any tables?
> This is a key generated by the database on a temporary table that is
> being used to answer your query. It looks like there is a little
> display bug in the sql-optimizer tool. The table name that should be
> shown is "<derived2>" which you can see in the HTML source of the
> page. See <>
> for some details on this type of internal optimization.
> > Finally, any advice on how to make it more efficient is highly appreciated.
> The "Using filesort" notes in the plan are caused by the GROUP BYs. If
> you can figure out how to get rid of the GROUP BY clauses things
> should be faster, but to do that I think you might need to collect
> information using multiple queries and some scripting language to
> collect and correlate the data outside of the database.
> I would generally expect any query that tries to compute statistics
> for all editors who have ever been active on a wiki will be slow. You
> might try finding a good way to break the query into smaller batches.
> > [0]:
> > [1]:
> Bryan
> --
> Bryan Davis              Wikimedia Foundation    <>
> [[m:User:BDavis_(WMF)]] Manager, Technical Engagement    Boise, ID USA
> irc: bd808                                        v:415.839.6885 x6855
> _______________________________________________
> Wikimedia Cloud Services mailing list
> (formerly

Jaime Crespo

Wikimedia Cloud Services mailing list (formerly