Hi, if we are analysing new editors creating new articles, then there are two groups I would expect to see: individual spammers and corporate spammers.

The individual spammers are the kids writing about the band that will be the next big thing on the ....... scene in their town, and whose first gig is next Tuesday provided they can recruit a drummer.

The corporate spammers are writing about some business, but in a style that makes a corporate flyer look neutral. 


As new page patrollers we make some implicit assumptions, which research might help to prove or disprove.

1. People whose first article is an attack page or vandalism will very rarely turn into productive editors. I have met or heard of three former vandals who became good editors, and there may be more, but I have never heard of a good editor who started out creating attack pages.

2. People who are paid to spam this site are unlikely to turn into good editors, but a large proportion of good editors made COI edits among their early edits. The classic Wikipedian is a time-rich altruist with Internet access that they can use for personal purposes. I'm not seeing much overlap there with corporate spammers, but people who make COI edits about something other than their employment clearly have personal access to the Internet.

3. Neutral point of view is something that new editors need to pick up, and POV editing among a new editor's early contributions is so common that a new account that writes neutrally is assumed by some to be a sock puppet or a returning editor.

4. We live in a copy-paste world. While a student plagiarising to try to get better marks is clearly acting in bad faith, a Wikipedia contributor using copy-paste is still a good-faith contributor who just needs to be taught not to copy and paste. Presumably these people are in Aaron's group 3. Again, we have a lot of former copyright violators in the community, and a lot of new articles get deleted as copyright violations.

My suspicion is that much of our tolerance of vandals is wasted effort. We would save a lot of time, and not lose any good editors, by moving from the current "five strikes and you are out" approach to vandals to one of blocking any new editor after their first edit if that edit was blatant vandalism. Copyvio, by contrast, is something where we could probably retain more of the editors by trying different approaches to explaining why their edits had to be rejected and how they could contribute.


Regards

Jonathan Cardy


On 25 Sep 2014, at 00:19, Aaron Halfaker <aaron.halfaker@gmail.com> wrote:

Sure!  You'll find the hand-coded set of users here within the next half hour (cron job copies datasets over).  

Categories: 
  1. Vandals - Purposefully malicious, out to cause harm
  2. Bad-faith - Trying to be funny, not here to help or harm
  3. Good-faith - Trying to be productive, but failing
  4. Golden - Successfully contributing productively
-Aaron

On Wed, Sep 24, 2014 at 1:15 PM, James Salsman <jsalsman@gmail.com> wrote:
Hi Aaron,

available for correlation with the number of new articles each user created?

Aaron Halfaker <ahalfaker@wikimedia.org> wrote:
>...
> I propose a project where we work together to generate 
> a summary so that I can call it "work." I've started a stub here:

_______________________________________________
Wiki-research-l mailing list
Wiki-research-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wiki-research-l

