Running an EXPLAIN on all three reveal identical indices being used. They should
theoretically all run equally fast.
Cyberpower678
English Wikipedia Account Creation Team
English Wikipedia Administrator
Global User Renamer
On Aug 7, 2019, at 22:15, Huji Lee
<huji.huji(a)gmail.com> wrote:
I have a question for the MySQL/MariaDB experts.
Short and sweet
How is that Query 1 <https://quarry.wmflabs.org/query/38237> runs in seconds and so
does Query 2 <https://quarry.wmflabs.org/query/38243>, but Query 3
<https://quarry.wmflabs.org/query/38244> -- which is essentially the same thing
except it tries to bring columns from both sides of the join -- takes forever to run?
Details
I have a bot that uses this script
<https://github.com/PersianWikipedia/fawikibot/blob/master/HujiBot/findproxy.py> to
identify and block IPs associated with open proxies. To be parsimonious with the blocks,
it only blocks the said IP and not its associated range. However, many proxy IPs belong to
a web hosting range and it would be better to block the entire range. The goal of the
query is to find all active blocks made by my bot, sort them in order of IP address, and
emulate the LEAD() function -- which we still don't have because we have not upgraded
to MariaDB 10.2 on Labs servers -- to make it easy to find cases where two consecutive IPs
start with the same two octets (like 100.24.X.Y and 100.24.C.D) so that I can manually
investigate those in more detail.
The nested SELECT that is repeated twice simply generates a list of all active blocks by
my bot. Query 1 shows all of them (88 rows) and Query 2 shows only the rows in which the
LEAD subquery row has a rownumber that is equal to 1 + that of a row in the original
88-row data. Of note, this also has 88 rows, though one of the rows is all NULLs because
for the last row of the data we should not find a match in the LEAD subqeury.
Anyhow, Query 3 simply aims to put the actual data and the LEAD subquery data side by
side and that is where things fall apart somehow. I cannot run an EXPLAIN on this through
Quarry, because Quarry does not like SQL variables :/ and I have no idea how else to
diagnose this problem.
Thanks!
Huji
_______________________________________________
Wikimedia Cloud Services mailing list
Cloud(a)lists.wikimedia.org (formerly labs-l(a)lists.wikimedia.org)
https://lists.wikimedia.org/mailman/listinfo/cloud