[Foundation-l] Deleting blatant copyright violations from the database
Marco Chiesa
chiesa.marco at gmail.com
Mon Aug 13 09:51:29 UTC 2007
Brian wrote:
>[0] is the addition of an abstract from the journal Nature [1]. It was in
>the encyclopedia for four months until I accidentally found it. I was told
>in IRC that the procedure for this situation is to simply remove the change
>from the current revision of the article, because it is technically
>difficult to permanently remove things from the database. This seems
>incredibly problematic to me. From a legal perspective, I don't see any
>difference in viewing an old version of an article which contains a
>copyright violation, and that copyright violation still being in the current
>version. There is some effort to hide old revisions from search engines, but
>the violation still exists on the Internet, and the copyright owner's rights
>are still being violated.
>
>
I'm always surprised at the very lax attitude that en.wikipedia has
towards copyright violations. On it.wikipedia we have a much more
draconian approach: if a potential copyright violation is present
(usually at least a sentence copied from another website) all the
versions in the history containing that bit are deleted, and, if there
are good edits in between, a note is put in the talk page with the
deleted revisions. This is sometimes awful work for the sysop who
has to do it, since pages from which a copyvio has been removed sometimes get
edited with another copyvio. The risk is that previously deleted
versions may get restored by mistake, as the only procedure to remove a
version is deleting the page and restoring the good versions. So, for
heavily edited pages, from time to time we move the old versions to
another name which we protect, so that the history is not too long (we
did it for the village pump a couple of times, then we switched to
having a page for each thread that gets included in the weekly pump). We
also have a bot (RevertBot) which checks all the edits against Google and
Yahoo and creates a page of suspected copyvios, which a sysop then has to
check manually to see who copied from whom. Occasionally we discovered copyvios
on en.wikipedia that had been there for more than one year. We also had
a case of a trusted user with 30k edits, who had been a sysop in the past,
who was caught copying large chunks of text from printed encyclopedias.
That forced us to set up a project to selectively remove all his
non-typo edits as suspected copyvios, which also destroyed quite a bit
of work by honest users who had fixed his edits or added material that
didn't make sense to keep on its own but was valuable anyway.
Cruccone