On Sun, Jun 2, 2019 at 7:43 PM Jesse Plamondon-Willard pathoschild@gmail.com wrote:
Hello,
Some Toolforge SQL queries have much worse performance when updated for the new comment table.
For example, see one previous SQL query and its updated version which just adds a join on the comment table. The change adds ten dependent subqueries and increases Stalktoy load times from ≈15 seconds to ≈245 seconds, and the same change for another query was enough to have my tools temporarily rate-limited.
I haven't found a way to update those queries efficiently. Is there an optimisation I'm missing? Why does the comment view need subqueries on ten other tables, and is there an alternate version without the subqueries for cases where we just need to join by ID?
I don't have a good answer for the performance question, but I can provide some information about where the large number of subqueries come from. They are an artifact of the way that the comment table is exposed by the Wiki Replicas view layer. Comments can be suppressed and this suppression can be toggled back and forth over time. The flag for suppression lives in the related entity table (ipblocks in this case) rather than the comment table itself. Our current data sanitization system deals with this by deciding at runtime if a given comment should be visible or not to Wiki Replica users. This is done via a view that is defined in our Puppet repository [0]. This view is not aware of the context that it being used for in a given query, so it tests all of the entity table flags that could indicate suppression for each comment row.
One "fix" I could imagine for this would be for the Wiki Replicas to also provide per-entity views of the comment table that only return rows related to a single entity table. For example a "comment_ipblocks" view could use a where clause specific to the ipblocks entity table like "WHERE exists( select 1 from ipblocks where ipb_reason_id = comment_id and ipb_deleted = 0)" and exclude the tests for other entities (image, filearchive, revision, etc).
[0]: https://phabricator.wikimedia.org/source/operations-puppet/browse/production...
Bryan