Steven, SJ, and Petr: I’ve provided responses to the questions about the quantitative findings below. Please let me know if any additional clarification would be helpful.
“The report says "On mobile, edit completion rate decreased by -24.3% (-13.5pp)" -- what's the difference between the first and second percentage figures?”
Great question, SJ. Please let me know if the explanation below helps clarify this.
The first figure (-24.3%) indicates the relative change between the control and test groups. In other words, by what percentage did the edit completion rate observed in the test group differ from the rate observed in the control group? We observed an edit completion rate of 55.6% in the control group and 42.1% in the test group. That is a 24.3% decrease, calculated as the ratio of the absolute change between the two groups (42.1% minus 55.6%) to the reference value (55.6%).
The second figure (-13.5pp) represents the absolute change between the control and test groups: the test edit completion rate (42.1%) minus the control edit completion rate (55.6%), which equals -13.5 percentage points.
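If it helps to see the arithmetic spelled out, here is a small Python sketch that reproduces both figures from the two observed rates (the variable names are ours, not from the report):

    # Observed edit completion rates on mobile
    control_rate = 0.556  # 55.6% in the control group
    test_rate = 0.421     # 42.1% in the test group

    # Absolute change, in percentage points (pp)
    absolute_change_pp = (test_rate - control_rate) * 100
    print(f"{absolute_change_pp:+.1f}pp")  # -13.5pp

    # Relative change: the absolute change as a share of the control rate
    relative_change_pct = (test_rate - control_rate) / control_rate * 100
    print(f"{relative_change_pct:+.1f}%")  # -24.3%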
Both values are provided in the report because each expresses the size of the change in a different way. But by either measure, they indicate how much the edit completion rate changed between the test and control groups.
“In other words we lose 24% of saved edits in order to decrease the revert rate by 8.6%. This tradeoff does not seem good.”
The interaction between these two metrics is worth clarifying, so thank you for drawing our collective attention to it, Steven.
Below is an attempt to offer some additional clarity. We'd value knowing if this brings any new questions to mind…
The 24% decrease observed on mobile represents the relative change in edit completion rate between the control and test groups, as explained above. It does *not* reflect a 24% decrease in the total number of saved edits.
If we look at the impact on saved edits, the total number of saved new content edits on mobile decreased from 3,924 in the control group to 3,468 in the test group (a decrease of 456 saved new content edits, or a 12% relative decrease). However, Reference Check increased the number of saved new content edits on mobile that included a reference, from 60 in the control group to 1,012 in the test group (an increase of 952, or roughly 16 times more saved new content edits with a reference). See Figure 18 of the analysis report for more details [1].
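The same kind of quick Python check (again with our own variable names) confirms the saved-edit arithmetic above:

    # Saved new content edits on mobile (Figure 18 of the report [1])
    control_saved, test_saved = 3924, 3468
    print(test_saved - control_saved)                          # -456 edits
    print((test_saved - control_saved) / control_saved * 100)  # ~ -11.6%, i.e. a ~12% relative decrease

    # Saved new content edits on mobile that included a reference
    control_with_ref, test_with_ref = 60, 1012
    print(test_with_ref - control_with_ref)  # 952 more edits
    print(test_with_ref / control_with_ref)  # ~ 16.9x as many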
The edit completion rates in this analysis were based on a specific subset of all edits attempted during the A/B test: the proportion of edits where a person indicated intent to save that were then successfully published. We focused only on edits where a person indicated intent to save because this is the point in the workflow where Reference Check would be shown, and we wanted to exclude edits abandoned for other reasons before that point.
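To make that denominator concrete, here is an illustrative sketch of how the rate is defined. The counts below are hypothetical, chosen by us only so the result echoes the 55.6% control figure:

    # Hypothetical counts, purely for illustration -- not from the report
    edits_with_save_intent = 1000  # edits where a person indicated intent to save
    edits_published = 556          # of those, the edits successfully published
    completion_rate = edits_published / edits_with_save_intent
    print(f"{completion_rate:.1%}")  # 55.6%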
If we instead look at all edits that were started and then successfully published, there was no significant change in edit completion rate on mobile or desktop, because Reference Check was shown for only a small fraction of all edits that were started.
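One way to see why the effect washes out over that larger denominator: if Reference Check fires on only a small share of started edits, even a 13.5pp drop within that subset barely moves the overall rate. The baseline rate and share below are made-up numbers for illustration only:

    # Made-up inputs for illustration -- not measurements from the test
    baseline_rate = 0.70        # hypothetical completion rate over all started edits
    affected_share = 0.05       # hypothetical share of started edits shown Reference Check
    drop_within_subset = 0.135  # the observed -13.5pp drop, applied only to that subset

    overall_rate = baseline_rate - affected_share * drop_within_subset
    print(f"{overall_rate:.1%}")  # 69.3%, an overall shift of well under 1pp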
Zooming out, we seem aligned in thinking it will be important to actively monitor changes in edit completion rate to ensure future Edit Checks do not significantly disrupt the editor experience. In fact, we'd value knowing if there are other metrics you think we should monitor: the Editing Team is actively defining the requirements for a dashboard (https://phabricator.wikimedia.org/T367130) that will help us track how edit session health evolves over time as more Checks are introduced.
[1] https://mneisler.quarto.pub/reference-check-ab-test-report-2024/#number-of-n...
Thanks for the follow-up explanation on this, Megan and other folks from the team. Your explanation makes a lot of sense, and it's much less concerning now.