I thought I should add this too as I missed it in the previous email.
This link: https://en.wikipedia.org/wiki/Wikipedia:Ethically_researching_Wikipedia
talks about content analysis (e.g., looking at the number of references removed or the content removed), which is what we did with the few articles. That is the guidance we followed, as it says such work is "generally considered exempt from such requirements and does not require an IRB approval."
My advisor should be able to add more thoughts on this (I have asked him to reply on this thread).

Thanks,
Sidd




On Thu, Aug 11, 2016 at 9:36 PM, siddhartha banerjee <sidd2006@gmail.com> wrote:
As I mentioned earlier, this is not the first work on article generation. Some of the earliest work we know of: https://people.csail.mit.edu/csauper/pubs/sauper-sm-thesis.pdf
https://people.csail.mit.edu/regina/my_papers/wiki.pdf
None of these mention anything about human subjects, since ultimately no personal information is used (about the person who is deleting content, etc.). Nor did any reviewers or attendees at the conferences in this area raise questions about this aspect.
Also, https://en.wikipedia.org/wiki/Wikipedia:Wikipedia_Signpost/2015-01-28/Recent_research is relevant here as it talks about our previous work.

If a "record of someone doing something" is relevant from a human-subjects point of view, then any data on Wikipedia can be used to find the editors (if not the real person). For example:
https://www.aaai.org/ocs/index.php/AAAI/AAAI11/paper/viewFile/3505/3968
https://alchemy.cs.washington.edu/papers/wu08/wu08.pdf
I have met several researchers who work with this kind of data (revisions from Wikipedia), and nothing about IRB ever came up.

Nevertheless, as I said, if there are concrete rules, I think it would help the research community as a whole to know what can and cannot be done, and also to ask for permissions.
I appreciate the suggestions Stuart made in a previous email about experimenting on articles that would be deleted or that lack sources. But as of now we are not planning anything, and if we do, we will certainly get in touch with Denny (who had a video chat with me before starting this thread) to find out the best way of doing it.

I have asked my PhD advisor (the other author on the paper) to check this thread, and he will be able to give more input, as I am not very qualified to comment on these aspects.

Thanks,
Sidd