Brian Wolff <bawolff@...> writes:
Data does not prove things "good". Data proves (or, more likely, provides
some support for but does not prove) some objective hypothesis. Proving
normative claims with objective data is pretty much impossible. That may
sound pedantic, but I think it's an important distinction.
Evidence should be presented in the form of "This change improved
findability of the edit button by 40% among anons in our experiment [link to
details]. Therefore I/we believe this is a good change because I/we think
that findability of the edit button is important". Separating what the data
proves from what are personal opinions about the data is important to make
the "science" sound legitimate and not manipulated.
It sounds pedantic because it is :). Good/bad in my proposal was targeting
the hypothesis, not the moral concept of good/bad. Good = the hypothesis is
shown to be effective; bad = the hypothesis is shown to be ineffective.
What you've ignored in my proposal is the part where the community input is
part of the formation of the hypothesis. I also mentioned that vocal
minorities should be ignored with the exception of questioning the
methodology of the data analysis.
Consensus of the editor community is anecdotal data. That data may be
extremely biased and should be evaluated carefully. But it doesn't make sense
to just throw it out totally, particularly in cases where it's the only data
we have. We should also be evaluating why consensus and data are
conflicting. Maybe there are unstudied factors causing the conflict so the
two positions are not mutually exclusive.
--
Anecdotal data should be used as a means of following up on experiments, but
should not be considered in the data set as it's an unreliable source. If
there's a large amount of anecdotal data coming in, it's something that
should be part of the standard data set. There are obviously exceptions to
this, but assuming there's enough data it should be possible to gauge the
effectiveness of changes without relying on anecdotal data.
For instance, if a change negatively affects an editor's workflow, it should
be reflected in data like "avg/p95/p99 time for x action to occur", where x
is some normal editor workflow.
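
As a rough sketch of what that could look like (not an existing tool): the
function names, the nearest-rank percentile method, and the 10% tolerance
below are all assumptions made for illustration. It summarizes timing samples
for one workflow before and after a change and lists the metrics that got
noticeably worse.

```python
import math
from statistics import mean


def percentile(samples, pct):
    """Nearest-rank percentile: the smallest sample such that at least
    pct percent of all samples are less than or equal to it."""
    if not samples:
        raise ValueError("need at least one sample")
    ordered = sorted(samples)
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[max(rank, 1) - 1]


def summarize(samples):
    """Return the avg/p95/p99 summary for a list of action timings."""
    return {
        "avg": mean(samples),
        "p95": percentile(samples, 95),
        "p99": percentile(samples, 99),
    }


def regressions(before, after, tolerance=0.10):
    """Names of metrics that are worse after the change by more than
    `tolerance` (hypothetical threshold, relative to the 'before' value)."""
    old, new = summarize(before), summarize(after)
    return [name for name in old if new[name] > old[name] * (1 + tolerance)]


if __name__ == "__main__":
    # Made-up timings (seconds) for some editor workflow "x".
    before = list(range(1, 101))
    after = list(range(1, 96)) + [300] * 5  # a slow tail appears
    print(summarize(before))
    print(summarize(after))
    print(regressions(before, after))
```

In this made-up example p95 stays the same while avg and p99 get worse, which
is the kind of tail regression that only shows up if you track more than one
number.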
- Ryan