Gregory Maxwell wrote:
If you were using "is gay" as a measure of vandalism over time you might conclude that vandalism is decreasing when in reality "cluebot" is performing the same kind of analysis for its automatic vandalism suppression and the vandals have responded by vandalizing in forms that can't be automatically identified, such as by changing dates to incorrect values.
And if that's true, that's on net a bad thing. Most "is gay" vandalism (not all) is just stupid and embarrassing, and it will be obvious to the reader as vandalism. Lots of people get how Wikipedia works and are reasonably tolerant of seeing that sort of thing from time to time.
But people expect that we should get the dates right, and they are right to ask that of us.
I understand that you're just making up a hypothetical, not saying that this is what is actually happening. I'm just agreeing with this line of thinking that says, in essence, "when we think about measuring vandalism, which is already hard enough, we also have to think about how damaging different kinds of vandalism actually are".
Greg, I think your email sounded a little negative at the start, but not so much further down. I think you would join me heartily in being super grateful for people doing this kind of analysis. Yes, some of it will be primitive and will suffer from many difficulties. But data-driven decision-making is a great thing, particularly when we are cognizant of the limitations of the data we're using.
I just didn't want anyone to get the idea that you were opposed to people doing research (and I'm sure I'm reading you right that you aren't). :-)
--Jimbo