Sorry, just realized. I want list of users who have such a first edit. Ie if user A makes an upload that's not tagged, the result shouldn't contain user A. 

Anyway, I feel the query should work that way - given it should first compute the subquery, and then limit it to rows having matching entry in change_tag. 

Martin

ne 24. 11. 2019 v 21:28 odesílatel Martin Urbanec <martin.urbanec@wikimedia.cz> napsal:
Ah, yes, first upload to Commons tagged with that tag. Thanks a lot, John!

Martin

ne 24. 11. 2019 v 20:56 odesílatel John <phoenixoverride@gmail.com> napsal:
Thats not what that query is getting. that is getting their first upload that is tagged with that change id. If you want to discount any non-upload edits I can look at optimizing it.

On Sun, Nov 24, 2019 at 2:39 PM Martin Urbanec <martin.urbanec@wikimedia.cz> wrote:
Hello, 

could someone please help me with optimizing the following query?

USE commonswiki_p;

SELECT first_upload, uploads, username FROM
(
  SELECT MIN(log_timestamp) AS first_upload, MIN(log_id) AS first_upload_id, COUNT(log_timestamp) AS uploads, log_user_text AS username
  FROM logging_compat
  LEFT JOIN user ON user_id = log_user
  JOIN page ON log_page = page_id
  WHERE log_type = "upload" AND (log_action = "upload" OR log_action = "overwrite") AND user_registration > "20190101000000"
  GROUP BY log_user
) AS first_uploads
JOIN change_tag ON ct_log_id = first_upload_id
WHERE ct_tag_id=21;


It takes over 30 minutes :/. I want to have a list of users whose first contrib to Wikimedia Commons is tagged with tag number 21. 

Thanks!

Martin
_______________________________________________
Wikimedia Cloud Services mailing list
Cloud@lists.wikimedia.org (formerly labs-l@lists.wikimedia.org)
https://lists.wikimedia.org/mailman/listinfo/cloud
_______________________________________________
Wikimedia Cloud Services mailing list
Cloud@lists.wikimedia.org (formerly labs-l@lists.wikimedia.org)
https://lists.wikimedia.org/mailman/listinfo/cloud