Just an example of long-term unreverted vandalism for your amusement:
http://en.wikipedia.org/w/index.php?title=Art_Tatum&diff=32294907&ol...
External links, language links, categories and a sound sample reference were deleted in December 2005 by an anon. The sound sample itself was permanently deleted a month later citing "orphaned fair use". The lost sections were progressively rebuilt by human effort over the 18 months following, except for the sound sample which remains lost.
-- Tim Starling
This kind of vandalism happens, for some reason, quite often. See http://en.wikipedia.org/wiki/User:Alai/prevcat-Jan07b. Many of the articles which are/were in that list had the same kind of vandalism. External links, references, categories & interwiki links were removed.
Garion96
Tim, an ogg file named Image:Elegie (Art Tatum).ogg is on commons. Is it the same file?
To avoid the need to delete orphaned media, it would be handy to have a brief list of articles where a media has been used in the past - the last five articles would usually be enough to help work out why it has been orphaned.
On 7/27/07, Garion96 garion96@gmail.com wrote:
This kind of vandalism happens, for some reason, quite often. See http://en.wikipedia.org/wiki/User:Alai/prevcat-Jan07b. Many of the articles which are/were in that list had the same kind of vandalism. External links, references, categories & interwiki links were removed.
I've just worked through 34 on that list, and only four were actually damaged. The majority of those entries appear to be due to maintenance tags and normal editing.
-- John
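The suggestion in this message, keeping a brief list of the last five articles where a media file was used, could be sketched roughly as follows. This is only an illustration in Python, not how MediaWiki stores anything: the class `UsageHistory`, its method names, and the file and article names in the usage note are all hypothetical.

```python
from collections import defaultdict, deque

HISTORY_LENGTH = 5  # "the last five articles", as suggested in the message


class UsageHistory:
    """Hypothetical per-file record of the most recent articles that used it."""

    def __init__(self, length=HISTORY_LENGTH):
        # Each media file gets a bounded deque; the oldest entry drops off
        # automatically once more than `length` articles have been recorded.
        self._recent = defaultdict(lambda: deque(maxlen=length))

    def record_use(self, media, article):
        """Note that `article` currently uses `media`."""
        history = self._recent[media]
        if article in history:
            history.remove(article)  # re-use moves it to the most recent slot
        history.append(article)

    def last_used_in(self, media):
        """Return recorded articles for `media`, most recent first."""
        # .get() avoids creating an empty entry for unknown files.
        return list(reversed(self._recent.get(media, ())))
```

An admin looking at an orphaned file could then call something like `history.last_used_in("Example.ogg")` and check whether one of the listed articles lost the reference through vandalism.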
It is also an old list, but I think there might be a newer one somewhere. I don't know for sure, since I haven't worked on it much since becoming an admin. Still, 4 out of 34 is relatively a lot.
Garion96
On 7/27/07, John Vandenberg jayvdb@gmail.com wrote: [snip]
To avoid the need to delete orphaned media, it would be handy to have a brief list of articles where a media has been used in the past - the last five articles would usually be enough to help work out why it has been orphaned.
When I checked a while ago (roughly a year back), the majority of the new orphaned media was either never used in an article or not used in an article for longer than 6 hours. (I saved snapshots of imagelinks every 6 hours.)
This might mean the upload-and-insert-into-an-article process is too difficult for the average drive-by user to manage... but that isn't news: prior usability studies showed this as well. Fortunately the overwhelming majority of the images uploaded by these users are images pulled off the web, so not much labor is lost... and of the legal and original works uploaded by these people, I expect that most lack enough information for us to know their licensing status. Alas, we have no way to reach these people, since they seldom provide a confirmed email address. :(
While there are certainly a lot of changes we could make to the upload process to reduce confusion, no matter how good we make it, some people will still be confused.
We need to find better ways to leverage our community to provide friendly realtime help to new users.
John Vandenberg wrote:
Tim, an ogg file named Image:Elegie (Art Tatum).ogg is on commons. Is it the same file?
Yes. But it's only there due to an accident of a maintenance script at some time in the past. It shouldn't be on commons, because it's not under a free licence.
To avoid the need to delete orphaned media, it would be handy to have a brief list of articles where a media has been used in the past - the last five articles would usually be enough to help work out why it has been orphaned.
In this case a small amount of care by the deleting admin would have avoided the problem. It should have been obvious that the file was a useful inclusion to the article on the named artist.
Our fair use policy supports the inclusion of such music clips. I'd like to see a lot more of them on Wikipedia, since a textual description of an artist's music is always going to be clumsy and imprecise compared to a short representative clip.
-- Tim Starling
On 7/27/07, Tim Starling tstarling@wikimedia.org wrote: [snip]
In this case a small amount of care by the deleting admin would have avoided the problem. It should have been obvious that the file was a useful inclusion to the article on the named artist.
Though the deletion itself is easily undone once the error is discovered... Almost as easily as the section removal. Asking human resources to spend perhaps 10x the amount of time on an unused image just doesn't scale well, that process is already way over taxed.
The problem could also be addressed by better handling of vandalism. Not that that's an easier problem, but solving it also solves a lot of other types of wiki-rot that more care with deletion doesn't solve.
In this case the vandalism never should have stuck: The diff was obviously bad. Any experienced human looking at it would have reverted. I'd argue that the total removal of all categories is so infrequently valid that we could painlessly have a software agent autorevert, or at least bring the edit to human attention.
We're still seeing a constant stream of bad edits, which software could trivially identify as bad, sit around until a human editor finds them on their own. (http://en.wikipedia.org/w/index.php?title=Cannabis_%28drug%29&diff=prev&...)
Can someone please tell me why this is still happening? We've had the software to identify bad edits for years now, why isn't it working?
The IRC bad-edit feed that I used to run went down for technical reasons... I didn't bother putting it back up because it wasn't fun to use, since people using it were slower on the draw than folks directly skimming the RC feed. ... but the fully automatic reverting bots shouldn't have a problem with non-fun work. ;)
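As a rough illustration of the check argued for above (flagging edits that remove every category), here is a minimal Python sketch. It assumes the old and new revision wikitext are already available as strings. The names `categories` and `removes_all_categories` and the regular expression are hypothetical, not part of MediaWiki or any existing bot, and the pattern only sees plain `[[Category:...]]` links, so categories added through templates are missed.

```python
import re

# Hypothetical pattern: matches [[Category:Name]] and [[Category:Name|sortkey]].
# A leading colon, as in [[:Category:Name]], is an ordinary link and is skipped.
CATEGORY_RE = re.compile(r"\[\[\s*Category\s*:\s*([^\]|]+)", re.IGNORECASE)


def categories(wikitext):
    """Return the set of category names linked directly in the wikitext."""
    return {name.strip() for name in CATEGORY_RE.findall(wikitext)}


def removes_all_categories(old_text, new_text):
    """True when the old revision had categories and the new one has none."""
    return bool(categories(old_text)) and not categories(new_text)
```

A bot or feed could use `removes_all_categories(old, new)` as one signal to bring an edit to human attention; whether it is safe enough for automatic reverting is a separate question this sketch does not answer.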
Gregory Maxwell wrote:
On 7/27/07, Tim Starling tstarling@wikimedia.org wrote: [snip]
In this case a small amount of care by the deleting admin would have avoided the problem. It should have been obvious that the file was a useful inclusion to the article on the named artist.
Though the deletion itself is easily undone once the error is discovered... Almost as easily as the section removal. Asking human resources to spend perhaps 10x the amount of time on an unused image just doesn't scale well, that process is already way over taxed.
That wasn't the case at the time of the deletion. File archiving came in in about June 2006. Many files were deleted before that, often for specious reasons.
I know, not your point.
-- Tim Starling
On 7/28/07, Tim Starling tstarling@wikimedia.org wrote:
That wasn't the case at the time of the deletion. File archiving came in in about June 2006. Many files were deleted before that, often for specious reasons.
I know, not your point.
Ahh. I'd missed that detail. I've gone and recovered the file. Ah. Er. I see you beat me to it. ;)
In addition to snapshotting all of en's uploads at a couple of points, back then I was also saving a copy of every file I automatically tagged precisely because deletion wasn't revertible otherwise, although I never had cause to recover more than a couple of them. (Interesting th
In this case the file was even in the 2005-05-30, so I presume that even more than you and I could have recovered it.
I know, not your point either. ;)
On 7/27/07, Tim Starling tstarling@wikimedia.org wrote:
Just an example of long-term unreverted vandalism for your amusement:
http://en.wikipedia.org/w/index.php?title=Art_Tatum&diff=32294907&ol...
External links, language links, categories and a sound sample reference were deleted in December 2005 by an anon. The sound sample itself was permanently deleted a month later citing "orphaned fair use". The lost sections were progressively rebuilt by human effort over the 18 months following, except for the sound sample which remains lost.
Recent Changes currently shows the size difference of an edit. Maybe it could show differences in external links and categories as well? That kind of information is hard to come by through a tool, as a "parsed list" (read: database entries) only exists for the current revision. The only point in time where it exists for both the old and the new revision is just prior to removing old entries from the database, upon saving.
Magnus
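To make the Recent Changes idea above a little more concrete, here is a small Python sketch of a per-edit summary of link and category changes. Note that it swaps in a different technique from the one described: instead of the database entries available at save time, it re-scans the raw wikitext of both revisions with regular expressions, so anything generated by templates is missed. The function names and patterns are hypothetical.

```python
import re

# Hypothetical patterns applied to raw wikitext (an approximation of the
# parsed link tables, not a replacement for them).
EXTERNAL_LINK_RE = re.compile(r"https?://[^\s\]<>|}]+")
CATEGORY_RE = re.compile(r"\[\[\s*Category\s*:\s*([^\]|]+)", re.IGNORECASE)


def extract(wikitext):
    """Collect external link URLs and directly linked category names."""
    return {
        "links": set(EXTERNAL_LINK_RE.findall(wikitext)),
        "categories": {c.strip() for c in CATEGORY_RE.findall(wikitext)},
    }


def delta_summary(old_text, new_text):
    """Summarise how many links and categories an edit added and removed."""
    old, new = extract(old_text), extract(new_text)
    parts = []
    for key in ("links", "categories"):
        added = len(new[key] - old[key])
        removed = len(old[key] - new[key])
        parts.append(f"{key} +{added}/-{removed}")
    return ", ".join(parts)
```

The resulting string could sit next to the existing size difference in a recent-changes listing, so that an edit dropping all links and categories stands out even when the byte change looks modest.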