[WikiEN-l] Flagged protection and patrolled revisions
William Pietri
william at scissor.com
Thu May 6 21:51:05 UTC 2010
On 05/03/2010 06:13 PM, phoebe ayers wrote:
> Is there a good usability-based way
> to do testing for these questions? (Has it been done, or discussed
> somewhere?) All I've got to go on is gut feelings one way or another.
>
Great question!
There are two broad sorts of testing typically done for things like
this: qualitative and quantitative.
For the qualitative approach, you'll sit some representative user down
and have them perform some tasks. You sit quietly and watch them
carefully, often recording both the screen and their faces. Some people
like to have them narrate their actions and thoughts, so you can get a
better idea of how they're taking things. Advanced versions even include
eye-tracking, so you can get better insight into what they're looking at
and reacting to. This can be done in person or remotely, and there are
sites like usertesting.com to automate the process.
There are a number of quantitative approaches, but the most popular is
A/B testing, where we'd deploy two different versions of the feature and
assign a bunch of users to each group. Then we'd look at their behavior
over time with respect to certain metrics, like returning to look at
their articles, talk page participation, and long-term editing behavior.
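As a rough illustration of the assignment step, here is a minimal Python
sketch. It assumes each user has some stable identifier and that we log
whether they came back; the function names, the experiment label, and the
event format are all hypothetical, not anything the WMF actually runs.
Hashing the identifier keeps a given user in the same group on every visit
without storing the assignment anywhere.

```python
import hashlib


def assign_group(user_id: str, experiment: str = "flagged-wording-test") -> str:
    """Deterministically place a user in group 'A' or 'B'.

    The experiment label (a made-up name) is mixed into the hash so that
    different experiments split the same users independently.
    """
    digest = hashlib.sha256(f"{experiment}:{user_id}".encode("utf-8")).digest()
    return "A" if digest[0] % 2 == 0 else "B"


def return_rate(events):
    """Compute the fraction of users in each group who returned.

    `events` is an iterable of (user_id, returned) pairs, where `returned`
    is a bool. A group with no users gets a rate of None.
    """
    totals = {"A": 0, "B": 0}
    returned = {"A": 0, "B": 0}
    for user_id, did_return in events:
        group = assign_group(user_id)
        totals[group] += 1
        returned[group] += int(did_return)
    return {g: (returned[g] / totals[g] if totals[g] else None) for g in totals}
```

Comparing the two rates is only the start; with real data you'd also want
a significance test before concluding that one version is better.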
Right now the WMF is doing a little of the first, and -- AFAIK -- none
of the second. The qualitative testing is relatively expensive and
time-consuming to do. For a minor variation like this, qualitative testing is
probably better done as part of a broader test of novice editing
experiences.
I'd love to do more quantitative testing, as that would settle a lot of
questions like this, but being set up for that would require non-trivial
infrastructure changes, plus a serious discussion about how much
cookie-based user tracking we want to do. Right now there's other
infrastructure work that is definitely higher priority. E.g., making
sure that a well-placed hurricane won't cause major site issues.
So for now, although I wish it were otherwise, we may have to settle for
trying things out live and trying to calibrate our gut feelings based on
what user reactions we can glean.
William