On Sun, Jun 2, 2019 at 7:43 PM Jesse Plamondon-Willard
<pathoschild(a)gmail.com> wrote:
Hello,
Some Toolforge SQL queries have much worse performance when updated for the new comment
table.
For example, see one previous SQL query and its updated version which just adds a join on
the comment table. The change adds ten dependent subqueries and increases Stalktoy load
times from ≈15 seconds to ≈245 seconds, and the same change for another query was enough
to have my tools temporarily rate-limited.
I haven't found a way to update those queries efficiently. Is there an optimisation
I'm missing? Why does the comment view need subqueries on ten other tables, and is
there an alternate version without the subqueries for cases where we just need to join by
ID?
I don't have a good answer for the performance question, but I can
provide some information about where the large number of subqueries
come from. They are an artifact of the way that the comment table is
exposed by the Wiki Replicas view layer. Comments can be suppressed
and this suppression can be toggled back and forth over time. The flag
for suppression lives in the related entity table (ipblocks in this
case) rather than the comment table itself. Our current data
sanitization system deals with this by deciding at runtime if a given
comment should be visible or not to Wiki Replica users. This is done
via a view that is defined in our Puppet repository [0]. This view is
not aware of the context that it being used for in a given query, so
it tests all of the entity table flags that could indicate suppression
for each comment row.
One "fix" I could imagine for this would be for the Wiki Replicas to
also provide per-entity views of the comment table that only return
rows related to a single entity table. For example a
"comment_ipblocks" view could use a where clause specific to the
ipblocks entity table like "WHERE exists( select 1 from ipblocks where
ipb_reason_id = comment_id and ipb_deleted = 0)" and exclude the tests
for other entities (image, filearchive, revision, etc).
[0]:
https://phabricator.wikimedia.org/source/operations-puppet/browse/productio…
Bryan
--
Bryan Davis Wikimedia Foundation <bd808(a)wikimedia.org>
[[m:User:BDavis_(WMF)]] Manager, Technical Engagement Boise, ID USA
irc: bd808 v:415.839.6885 x6855