Hi all,
I have a query that used to do a GROUP BY on rev_user and now I have migrated it to do so using rev_actor instead. The latest version looks like this https://pastebin.com/jWTzsiJY and its performance is not good https://tools.wmflabs.org/sql-optimizer?use=fawiki_p&sql=SELECT%0D%0A++actor_name%2C%0D%0A++SUM%28%0D%0A++++CASE%0D%0A++++++WHEN+page_namespace+%3D+0+THEN+1%0D%0A++++++ELSE+0%0D%0A++++END%0D%0A++%29+article%2C%0D%0A++SUM%28%0D%0A++++CASE%0D%0A++++++WHEN+page_namespace+%3D+10+THEN+1%0D%0A++++++ELSE+0%0D%0A++++END%0D%0A++%29+tpl%2C%0D%0A++SUM%28%0D%0A++++CASE%0D%0A++++++WHEN+page_namespace+%3D+12+THEN+1%0D%0A++++++ELSE+0%0D%0A++++END%0D%0A++%29+helppage%2C%0D%0A++SUM%28%0D%0A++++CASE%0D%0A++++++WHEN+page_namespace+%3D+14+THEN+1%0D%0A++++++ELSE+0%0D%0A++++END%0D%0A++%29+cat%2C%0D%0A++SUM%28%0D%0A++++CASE%0D%0A++++++WHEN+page_namespace+%3D+100+THEN+1%0D%0A++++++ELSE+0%0D%0A++++END%0D%0A++%29+portal%2C%0D%0A++COUNT%28rev_first%29+tot%0D%0AFROM+revision+r%0D%0AJOIN+%28%0D%0A++select%0D%0A++++MIN%28rev_id%29+rev_first%2C%0D%0A++++rev_page%0D%0A++FROM+revision%0D%0A++GROUP+BY+rev_page%0D%0A%29+f%0D%0A++ON+r.rev_id+%3D+f.rev_first%0D%0AJOIN+page%0D%0A++ON+page_id+%3D+r.rev_page%0D%0AJOIN+actor%0D%0A++ON+rev_actor+%3D+actor_id%0D%0ALEFT+JOIN+user_groups%0D%0A++ON+actor_user+%3D+ug_user%0D%0A++AND+ug_group+%3D+%22bot%22%0D%0AWHERE%0D%0A++actor_user+%3C%3E+0%0D%0A++AND+ug_group+IS+NULL%0D%0A++AND+page_namespace+IN+%28%0D%0A++++0%2C%0D%0A++++10%2C%0D%0A++++12%2C%0D%0A++++14%2C%0D%0A++++100%0D%0A++%29%0D%0AGROUP+BY+rev_actor%0D%0AORDER+BY+tot+desc%0D%0ALIMIT+300 at all. In fact, I left it running for fawiki for several hours and it did not generate any results (it used to generate results in about 30-40 minutes). Also, the execution plan that is shown in the second link involves some tables like filearchive, image and oldimage that are not part of the query.
Is this a side effect of the revision_actor_temp table?
Also, what is that "key0" about, which is not associated with any tables?
Finally, any advice on how to make it more efficient is highly appreciated.
Thanks! Huji