The query took more than an hour, so I killed it.

I think a better strategy is to first run the query against the logging table to get a list of log entries, and then run the subquery that retrieves the edit counts.

My new query is shown below; however, I cannot create temporary tables with my user on Toolforge. I tried creating a database on the "tools" server, but then I cannot access it from the server that hosts fawiki_p.

What is the proper strategy for temporary (physical or in-memory) tables on Toolforge?
Huji

create temporary table logs
select
  log_id,
  log_timestamp,
  log_user,
  log_user_text,
  log_title,
  log_comment,
  log_page,
  page_namespace,
  case when ug_group = 'bot' then 1 else 0 end as user_is_bot
from logging_userindex
join page
  on page_id = log_page
left join user_groups
  on log_user = ug_user
  and ug_group = 'bot'
where log_type = 'move'
and log_id > 9406768;

select
  logs.*,
  (
    select count(1)
    from revision_userindex
    where
      rev_user = log_user
      and rev_timestamp < log_timestamp
  ) as rev_count_before_move
from logs;
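
If temporary tables turn out not to be an option, the same two-step plan can be collapsed into a single statement by inlining the first query as a derived table. This is only a sketch combining the two queries above, untested against the replicas:

```sql
-- Sketch: the two queries above collapsed into one statement,
-- using the first query as a derived table instead of a temp table.
-- Untested against the Wiki Replicas; same schema assumptions as above.
select
  logs.*,
  (
    select count(1)
    from revision_userindex
    where rev_user = logs.log_user
      and rev_timestamp < logs.log_timestamp
  ) as rev_count_before_move
from (
  select
    log_id,
    log_timestamp,
    log_user,
    log_user_text,
    log_title,
    log_comment,
    log_page,
    page_namespace,
    case when ug_group = 'bot' then 1 else 0 end as user_is_bot
  from logging_userindex
  join page
    on page_id = log_page
  left join user_groups
    on log_user = ug_user
    and ug_group = 'bot'
  where log_type = 'move'
    and log_id > 9406768
) as logs;
```

Whether the optimizer handles the correlated subquery efficiently here is a separate question, but this form at least avoids needing any writable database.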


On Sat, Dec 30, 2017 at 2:21 PM, Brad Jorsch (Anomie) <bjorsch@wikimedia.org> wrote:
On Sat, Dec 30, 2017 at 2:10 PM, Maximilian Doerr <maximilian.doerr@gmail.com> wrote:
Isn’t there a logging_logindex table to use that should optimize this?

It looks like logging_logindex is useful for queries that target the page_time, log_page_id_time, and type_action indexes.

It looks like logging_userindex is useful for queries that target the user_time, log_user_type_time, log_user_text_type_time, and log_user_text_time indexes.

Any of the three views should work for queries that use the primary key or the type_time and times indexes.
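
To illustrate the point above (a hedged example; the user ID is made up), a query that filters by log_user and orders by timestamp would target the user_time index, so logging_userindex would be the view to use:

```sql
-- Example: this filter pattern targets the user_time index,
-- so logging_userindex is the appropriate view per the list above.
-- The user ID 12345 is a hypothetical placeholder.
select log_id, log_timestamp, log_type, log_action
from logging_userindex
where log_user = 12345
order by log_timestamp desc
limit 10;
```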

--
Brad Jorsch (Anomie)
Senior Software Engineer
Wikimedia Foundation

_______________________________________________
Wikimedia Cloud Services mailing list
Cloud@lists.wikimedia.org (formerly labs-l@lists.wikimedia.org)
https://lists.wikimedia.org/mailman/listinfo/cloud