A new app by Thomas Steiner (@tomayac) counting bot vs human edits in real time from the RecentChanges feed:
http://wikipedia-edits.herokuapp.com/
(read more [1]). The application comes with a public API exposing Wikipedia and Wikidata edits as Server-Sent Events [2].
Dario
[1] http://blog.tomayac.com/index.php?date=2013-10-14&time=16:49:46&perm...
[2] https://en.wikipedia.org/wiki/Server-sent_events
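For anyone who wants to try consuming a Server-Sent Events feed like the one mentioned above, here is a minimal sketch in Python. It only parses the standard SSE wire format (`data:` lines, with a blank line ending each event) from an iterable of text lines. The JSON payload and its `is_bot` field are assumptions made up for illustration; the real field names of the app's API are not given in this thread.

```python
import json
from typing import Iterable, Iterator


def iter_sse_data(lines: Iterable[str]) -> Iterator[str]:
    """Yield the data payload of each event in a Server-Sent Events stream."""
    buffer = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line terminates the current event.
            if buffer:
                yield "\n".join(buffer)
                buffer = []
        elif line.startswith("data:"):
            value = line[len("data:"):]
            # A single leading space after the colon is not part of the value.
            buffer.append(value[1:] if value.startswith(" ") else value)
        # Comment lines (starting with ":") and other fields are ignored here.


def count_bot_vs_human(lines: Iterable[str]) -> dict:
    """Tally events by the hypothetical boolean field `is_bot`."""
    counts = {"bot": 0, "human": 0}
    for payload in iter_sse_data(lines):
        edit = json.loads(payload)
        counts["bot" if edit.get("is_bot") else "human"] += 1
    return counts


# Hand-written sample stream; the payload shape is an assumption.
sample_stream = [
    'data: {"title": "Example", "is_bot": true}\n',
    "\n",
    ": keep-alive comment\n",
    'data: {"title": "Another page", "is_bot": false}\n',
    "\n",
    'data: {"title": "Third page", "is_bot": true}\n',
    "\n",
]
print(count_bot_vs_human(sample_stream))
```

Against a live endpoint, the same function could be fed the decoded lines of a streaming HTTP response instead of the sample list.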
Very cool. If you include Wikidata, then more than 50% of the edits on the Wikimedia projects are made by bots. One of the dead horses I like to beat is that bot editors should be treated as first-class citizens of Wikipedia, and this data nicely illustrates that. I think this is a bigger watershed moment than mobile vs non-mobile (we might have reached this threshold a while back), and we should have a much more rigorous discussion about the future of bots on Wikipedia. Particularly as all our big features are aimed at human editors :) D
On Mon, Oct 14, 2013 at 12:30 PM, Dario Taraborelli <dtaraborelli@wikimedia.org> wrote:
Analytics mailing list Analytics@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/analytics
Wow, thanks for making that. I had thought that the combination of shifting vandalfighting to the edit filters and the change of interwikis from peer to peer to hub and spoke had reduced the number of bot edits. But then I find there are bots following me around: when I semi-protect an article, a bot follows me and adds the little silver padlock icon to the page. Which gets me to the more serious point about measuring things by edit count. All edits are far from equal: someone who manually writes several paragraphs of encyclopaedic content is contributing something far more valuable than, for example, my recategorising 30 images from one category to another. I'd go so far as to say that in that example the one edit that writes paragraphs of text involves more work than, and is of perhaps thirty times the "value" of, my thirty edits. But measured on edit count we would value them the other way round.
I think that this is such a distortion that it risks skewing our whole understanding of the project, and I'm wondering if anyone has found another way to try and measure someone's contributions? In other spheres of human endeavour you might try and estimate how many hours of work had been contributed by particular people or groups of people. You could, I suspect, get part of the way there by counting only the number of unique calendar hours in which someone had made edits. Some people routinely do hundreds of manual edits in an hour, and such people would on this measure have contributed far fewer hours than those whose edits are rarely less than ten minutes apart. Of course, estimating the hours of effort by counting the number of unique hours in which one has edited would overestimate the effort of the hypothetical editor who occasionally contributes the odd couple of minutes, and underestimate the efforts of someone who takes more than an hour before they actually hit save. But it would probably be much closer to people's relative donation of time to the project than simply looking at raw edit count.
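To make the "unique calendar hours" idea concrete, here is a rough sketch in Python. It assumes you already have each editor's edit timestamps as `datetime` objects; the function name and the sample data are invented for illustration.

```python
from datetime import datetime


def unique_edit_hours(timestamps):
    """Count distinct calendar hours (date plus hour) containing at least one edit."""
    return len({ts.replace(minute=0, second=0, microsecond=0) for ts in timestamps})


# Invented sample data: one editor makes 30 quick edits within a single hour,
# another makes 3 edits spread over three different hours.
burst_editor = [datetime(2013, 10, 17, 9, m) for m in range(0, 60, 2)]
slow_editor = [datetime(2013, 10, 17, h, 15) for h in (9, 11, 14)]

print(len(burst_editor), unique_edit_hours(burst_editor))  # 30 1
print(len(slow_editor), unique_edit_hours(slow_editor))    # 3 3
```

On raw edit count the first editor looks ten times more active; on unique hours the second editor has contributed three times as much, which shows how far the two measures can diverge.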
On a related note, I'm looking for ways to test the hypothesis that some of the drop in edit count between our 2007 peak and the introduction of the edit filters in 2009 was simply due to bots reverting vandals faster, and getting individual vandals through the warning levels and blocked quicker and for fewer total edits than would have happened in manual vandalfighting. Can anyone suggest ways to test that hypothesis or measure its effect?
Regards
Jonathan
On 14 October 2013 17:40, Diederik van Liere dvanliere@wikimedia.org wrote:
Wiki-research-l mailing list Wiki-research-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wiki-research-l
On Thu, Oct 17, 2013 at 12:00 PM, WereSpielChequers werespielchequers@gmail.com wrote:
All edits are far from equal, someone who manually writes several paragraphs of encyclopaedic content is contributing something far more valuable than for example my recategorising 30 images from one category to another. I'd go so far as to say that in that example the one edit that writes paragraphs of text involves more work and is of perhaps thirty times the "value" of my thirty edits. But measured on edit count we would value them the other way round.
Wouldn't it be quite easy to weight the edit counts by text added/deleted?
Hi Joe,
Yes, total bytes changed would be a relatively easy weighting factor, but not, I suspect, a useful one. So many edits are semi-automated additions of large templates, or reversions of previous changes, that you would need to do some complex filtering to identify the "amount of meaningful work". And it isn't just the individual edit that you need to consider: like most active editors, I have a number of scripts or tools that I have opted into. In one extreme example, if I use Twinkle to nominate an article for deletion at AfD, all I have to do is click on a couple of menus, type a one-sentence case for deleting the article, and submit. My account then makes the following edits:
1. Creates a page for the deletion discussion with various bits of code, including one copy of my deletion rationale.
2. Lists that deletion discussion on the page for that day's deletion discussions.
3. Templates the article with a warning that it is being considered for deletion, with the rationale and several sentences of verbiage.
4. Writes a template to the author's talkpage telling them what I have done, quoting the rationale and explaining what they can do about it.

Note that by Wikipedia standards these are four manual edits, each adding a generous paragraph or more, and all generated by the writing of a rationale that could be shorter than "Not yet played so not yet notable".
For example, I just went to recent changes, and by far the biggest edit of that moment was this one: https://en.wikipedia.org/w/index.php?title=Climate_change_in_Sweden&diff=next&oldid=577633136. At first glance it looks like someone added 1500 bytes of encyclopaedic text. Then you realise they merely reverted vandalism from three minutes earlier.
Then there's the issue that some people will go through an article fixing an assortment of typos and perhaps also rephrasing the English; that can be quite a lot of time spent and many changes, with little if any net change in bytes. I'm sure it would be possible to come up with some set of filters that compensates for most of this. But it won't be simple, and it would always be vulnerable to someone having some new and/or undocumented way of generating text.
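As a rough illustration of the byte-weighting idea and of the revert problem described above, here is a small Python sketch. It assumes a page history is available as chronological `(user, size_in_bytes, content_hash)` tuples, and it treats any revision whose hash matches an earlier revision as a revert. The data layout, the names and the sample history are all invented for illustration.

```python
from collections import defaultdict


def byte_weighted_counts(revisions, skip_reverts=True):
    """Sum absolute byte changes per user over one page's history.

    `revisions` is a chronological list of (user, size_in_bytes, content_hash)
    tuples. A revision whose hash matches an earlier revision of the page is
    treated as a revert and, if skip_reverts is set, contributes nothing.
    """
    totals = defaultdict(int)
    seen_hashes = set()
    previous_size = 0
    for user, size, content_hash in revisions:
        is_revert = content_hash in seen_hashes
        if not (skip_reverts and is_revert):
            totals[user] += abs(size - previous_size)
        seen_hashes.add(content_hash)
        previous_size = size
    return dict(totals)


# Invented history: an author writes 4000 bytes, a vandal removes 1500,
# and a patroller restores the earlier version.
history = [
    ("author", 4000, "h1"),
    ("vandal", 2500, "h2"),
    ("patroller", 4000, "h1"),
]
print(byte_weighted_counts(history, skip_reverts=False))
print(byte_weighted_counts(history))
```

Without the filter the patroller is credited with 1500 bytes of "writing"; with it they get none. The vandal's removal is still counted, and typo fixing or template-heavy tool edits are not handled at all, so this covers only one of the filters that would be needed.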
On 17 October 2013 12:18, Joe Corneli holtzermann17@gmail.com wrote:
Wiki-research-l mailing list Wiki-research-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wiki-research-l
WSC,
Halfaker and Geiger have done some work on an intuitive way of measuring work/contribution. Instead of edit count, they propose the idea of an edit session: http://meta.wikimedia.org/wiki/Research:Metrics/edit_sessions
I have not seen this widely adopted in research, but I think it is a compelling idea with more validity than mere edit count. Then again, *within-user changes to edit count over time* seem to be a sufficient measure of work/contribution in certain research scenarios as well. I guess it all depends on the question.
Cheers, Michael
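For readers unfamiliar with edit sessions, here is a minimal Python sketch of the general idea: consecutive edits by one user are grouped into a session as long as the gap between them stays under some inactivity cutoff. The one-hour default here is an assumption for illustration; see the linked page for the definition the authors actually propose.

```python
from datetime import datetime, timedelta


def edit_sessions(timestamps, cutoff=timedelta(hours=1)):
    """Group timestamps into sessions; a gap longer than `cutoff` starts a new one."""
    sessions = []
    for ts in sorted(timestamps):
        if sessions and ts - sessions[-1][-1] <= cutoff:
            sessions[-1].append(ts)
        else:
            sessions.append([ts])
    return sessions


# Invented sample: three edits in the morning, two in the afternoon.
edits = [
    datetime(2013, 10, 17, 9, 0),
    datetime(2013, 10, 17, 9, 20),
    datetime(2013, 10, 17, 9, 50),
    datetime(2013, 10, 17, 14, 5),
    datetime(2013, 10, 17, 14, 10),
]
sessions = edit_sessions(edits)
print(len(sessions))                        # 2
print([s[-1] - s[0] for s in sessions])     # first-to-last edit span per session
```

The span from first to last edit in each session (50 minutes and 5 minutes here) gives a lower bound on time spent, since it misses the work done before the first save.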