On Mon, Aug 25, 2014 at 11:41 AM, Steven Walling <swalling(a)wikimedia.org> wrote:
On Mon, Aug 25, 2014 at 11:05 AM, Ryan Kaldari <rkaldari(a)wikimedia.org> wrote:
There is nothing stopping us, however, from analysing *relative* trends
using existing data. For example, we could generate graphs showing the
relative difference per month in edits by men and women and this data would
be unaffected by the unreliability of the absolute numbers (since we would
only be looking at changes in the percentages).
Using bad data here is worse than having no data. As Aaron and I
recommended when we talked in person, we should not invest in using the
gendered language preference data to track overall gender among editors.
It's a case of garbage in, garbage out. Instead, we should be investing in
more reliable ways to track gender among the editor population, if it's a
metric that we care about.
You can get accurate information from bad or incomplete data. For example,
I can measure changes in tide levels without knowing the volume of the
ocean. That's all I'm proposing here: measuring the change per month.
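
To make the idea concrete, here is a rough sketch of the kind of calculation I have in mind. The counts are made up for illustration, and the names (monthly_edits, female_share, monthly_change) are hypothetical, not anything that exists in our tooling. It also assumes that whatever bias there is in who sets the preference stays roughly constant from month to month.

```python
# Hypothetical monthly edit counts, split by the gendered language
# preference. These numbers are invented for illustration only.
monthly_edits = {
    "2014-05": {"female": 1200, "male": 10800},
    "2014-06": {"female": 1300, "male": 10700},
    "2014-07": {"female": 1250, "male": 11250},
}


def female_share(counts):
    """Percentage of gendered edits that came from the 'female' setting."""
    total = counts["female"] + counts["male"]
    return 100.0 * counts["female"] / total


def monthly_change(data):
    """Return (month, change in percentage points vs. the previous month)."""
    months = sorted(data)  # "YYYY-MM" strings sort chronologically
    shares = [female_share(data[m]) for m in months]
    return [(months[i], shares[i] - shares[i - 1]) for i in range(1, len(months))]


changes = monthly_change(monthly_edits)
for month, delta in changes:
    print(f"{month}: {delta:+.2f} percentage points")
```

Only the month-to-month differences are reported, so the absolute percentages never need to be trusted on their own.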
Please take a look at the Trello card for a more complete description of