Re: [Wiki-research-l] More accurate revert detection in Wikipedia, alternative to MD5 identical revision method

29 Jun 2012

      <quote who="Aaron Halfaker" date="Wed, Jun 27, 2012 at 04:39:30PM -0700">
...
I'm confused by your explanation.
How is it possible that this 37% of revisions that are detected as reverts
via a md5 hash are not considered reverts by (I presume) humans?  Can you
give a common example?  By definition, identity revert revisions represent
an exact replica of a previous revision in an article and, therefore,
should discard any intermediate changes.  What definition of "revert" are
you using that the md5 hash method does not satisfy?
</quote>
Also, I can't tell from either the paper or the conversation here: Are
you limiting this to revisions with identical hashes that are separated
by only one edit? When you do that, things become a bit more
complicated.
And are you sure your human coders aren't just relying on edit
summaries? Like Aaron, I'm having a hard time imagining a situation
where revisions go HASH-A => HASH-B => HASH-A that shouldn't be
treated like a revert, and I tend to think this sounds more like
fallible coders than broken tools. Even if the user doesn't *know* or
think they are reverting an edit, it seems wrong *not* to call that a
revert.
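
To make the HASH-A => HASH-B => HASH-A case concrete, here is a minimal
sketch of identity-revert detection over a list of revision texts. It is
not the code from the paper or from anyone in this thread. The function
names and the max_distance parameter are hypothetical, and max_distance
is only one way to express the "separated by only one edit" restriction
asked about above (max_distance=2 allows exactly one intervening
revision).

```python
import hashlib


def md5_of(text):
    """Return the hex MD5 digest of a revision's text."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()


def identity_reverts(texts, max_distance=None):
    """Find identity reverts in a chronological list of revision texts.

    Returns a list of (reverting, reverted_to, reverted) tuples, where
    `reverting` and `reverted_to` are revision indexes with identical
    hashes and `reverted` lists the indexes of the discarded revisions
    in between.

    max_distance is a hypothetical cap on how far back a matching hash
    may be; None means no cap.
    """
    last_seen = {}  # digest -> index of the most recent revision with it
    results = []
    for i, text in enumerate(texts):
        digest = md5_of(text)
        j = last_seen.get(digest)
        # i - j == 1 would be a null edit (nothing in between to revert).
        if j is not None and i - j > 1:
            if max_distance is None or i - j <= max_distance:
                results.append((i, j, list(range(j + 1, i))))
        last_seen[digest] = i
    return results
```

With the revision sequence A, B, A, C, D, A, the uncapped call reports
both the one-edit revert and the later two-edit revert, while
max_distance=2 reports only the first. That difference is why the size
of the window matters when comparing hash-based results against human
coding.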
Later,
Mako
-- 
Benjamin Mako Hill
mako@mit.edu
http://mako.cc/

Creativity can be a social contribution, but only in so far
as society is free to use the results. --GNU Manifesto
