On Thu, Nov 20, 2014 at 5:43 PM, Nathan <nawrich@gmail.com> wrote:
This research excludes any incidence of the word "cunt" that appears in revisions other than those that are currently live, right? Since it relies exclusively on the search engine, which only returns live results? Doesn't that seem like a substantial weakness, and one that undercuts any year to year frequency analysis? 

The search results spanned a large number of years, and we're not talking about a small percentage. I don't think it undercuts the analysis at all, because the search results seem random enough to yield a statistically relevant sample. I talked about the sample size and methodology with a few researchers, including some doing Wikipedia research, who thought the sample size was adequate and representative for the findings.

At the same time, once I got past 20%, there was pretty much no substantial change in any of the numbers on a yearly basis. I went up to 25% just to make sure. This also confirmed for me that the sample size was fine.

Do you have some reason to think the results wouldn't be statistically relevant in terms of sample size? This isn't my normal research methodology, so I could be wrong, but I'd love to know more about what factors you think would lead to them not being representative across all years.

Also, can you confirm if I'm understanding this correctly - the use of the word in a discussion about whether it is a gendered insult or not, or in a discussion admonishing a user to stop using it as an insult, counts each use (regardless of intent) as an additional incident of gendered insult? 


Yes, "I think you should stop using cunt because it is offensive to women" would count as a gendered insult in the context of what I did, because it is clear from context that the word is being used in a gender-loaded way. That use is clearly different from "You're a cunt." I could change the label "gendered insult" to "gendered", but the point remains the same. I reviewed this classification with several people before posting, and no one mentioned it as an issue, because it was clear to them what the point of the classification was in terms of this particular discursive analysis.

There was no separate category for "admonishment" or "discussion about not using this word on talk pages". If there were, the numbers for it would be pretty low, because admonishments are pretty rare and in many cases were met with more excuses for the word's usage. That should be apparent from the number of quotes in support versus the number of admonishments. It was actually a struggle to find admonishments, and they are overrepresented in the samples compared to the "cunt is okay" examples.

Sincerely,
Laura Hale