On 7/27/15, Ryan Lane <rlane32(a)gmail.com> wrote:
Brian Wolff <bawolff@...> writes:
Data does not prove things "good". Data proves (or, more likely, provides
some support for but does not prove) some objective hypothesis. Proving
normative claims with objective data is pretty much impossible.
That may sound pedantic, but I think it's an important distinction.
Evidence should be presented in the form of "This change increased
findability of the edit button by 40% among anons in our experiment [link to
details]. Therefore I/we believe this is a good change because I/we think
that findability of the edit button is important." Separating what the data
proves from what are personal opinions about the data is important to make
the "science" sound legitimate and not manipulated.
It sounds pedantic because it is :). Good/bad in my proposal was targeting
the hypothesis, not the moral concept of good/bad. Good = the hypothesis is
shown to be effective; bad = the hypothesis is shown to be ineffective.
At the risk of being a bit nitpicky here, if that's the case, and when
you say "...there's always some vocal minority that will hate change,
even when it's presented with data proving it to be good", what you
really mean is "... there's always some vocal minority that will hate
change, even when it's presented with data proving that there exists
some hypothesis about the change that can be shown to be in effect".
Similarly, when you say "The data is the voice of the community. It's
what proves if an idea
is good or bad.", what you really mean is "The data is the voice of the
community. It's what proves if an idea has a hypothesis which has been
shown to be in effect or not be in effect."
While I tend to agree with the versions of these statements where
"good" means a hypothesis is in effect, I don't think they make for a
very compelling argument.
What you've ignored in my proposal is the part where the community input is
part of the formation of the hypothesis. I also mentioned that vocal
minorities should be ignored with the exception of questioning the
methodology of the data analysis.
Fair enough. While I don't think data "experiments" should be the
be-all and end-all, they're certainly a useful tool. I agree that ensuring
community input in hypothesis formation and methodology critique is
vital to make sure that we make the best use of this tool.
Anecdotal data should be used as a means of following up on experiments, but
should not be considered part of the data set, as it's an unreliable source.
If there's a large amount of anecdotal data coming in, whatever it's
describing should become part of the standard data set. There are obviously
exceptions to this, but assuming there's enough data it should be possible
to gauge the effectiveness of changes without relying on anecdotal data.
For instance, if a change negatively affects an editor's workflow, it should
be reflected in data like "avg/p95/p99 time for x action to occur", where x
is some normal editor workflow.
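As a rough illustration of what such a workflow metric might look like, here is a minimal sketch that summarizes a set of timing samples as avg/p95/p99. The metric, the sample data, and the `summarize` helper are all hypothetical (using a simple nearest-rank percentile), not anything from an actual Wikimedia pipeline:

```python
# Hypothetical example: summarizing "time for x action to occur"
# as avg/p95/p99. Sample data and names are illustrative only.
import statistics

def summarize(samples):
    """Return avg, p95, p99 for a non-empty list of durations (seconds)."""
    ordered = sorted(samples)

    def pct(p):
        # Nearest-rank percentile: index of the sample at rank ceil(p% of n),
        # clamped to the valid range.
        k = max(0, min(len(ordered) - 1, round(p / 100 * len(ordered)) - 1))
        return ordered[k]

    return {
        "avg": statistics.mean(ordered),
        "p95": pct(95),
        "p99": pct(99),
    }

# e.g. seconds from page load until the editor opens the edit form;
# one slow outlier (9.5s) dominates the tail percentiles
timings = [1.2, 1.4, 1.1, 2.8, 1.3, 9.5, 1.2, 1.6, 1.4, 1.3]
print(summarize(timings))
```

A regression in an editor's workflow would then show up as the p95/p99 values drifting upward after a change, even when the average barely moves.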
Say we wanted to improve discoverability of the edit button for new
users. So we put it in <blink> tags. This pisses off everyone for the
obvious reason. How would we measure user aggravation?