> Message: 4
> Date: Thu, 4 Dec 2014 14:58:37 -0500
> From: "Sreejith K." <sreejithk2000(a)gmail.com>
> To: Wikimedia Commons Discussion List <commons-l(a)lists.wikimedia.org>
> Subject: Re: [Commons-l] Duplicate removal?
> Content-Type: text/plain;
> I am using Wikimedia APIs to create a gallery of duplicates and
> You can see the results here.
> The page also has a link to the script. If anyone is interested in using
> this script, let me know and I can work with you to customize it.
> - Sreejith K.
See also https://commons.wikimedia.org/wiki/Special:ListDuplicatedFiles
which lists the files that have the most byte-for-byte duplicates (most of
the time those really should be file redirects instead).
Thanks Jonas for experimenting with this sort of thing. I always wished we
did something with perceptual hashes internally, in addition to the SHA-1
hashes we compute currently.
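
To illustrate the difference between the two kinds of hash, here is a minimal Python sketch. It is not how MediaWiki or anyone's script does it: the tiny 2x2 grayscale "images" and the helper names (sha1_hex, average_hash, hamming) are made up for illustration, and the average hash is only one simple perceptual hash among several. The point is that a small brightness change gives a completely different SHA-1 but the same perceptual hash.

```python
import hashlib


def sha1_hex(pixels):
    """SHA-1 over the raw pixel bytes: changes if any single byte changes."""
    raw = bytes(v for row in pixels for v in row)
    return hashlib.sha1(raw).hexdigest()


def average_hash(pixels):
    """Toy perceptual hash: one bit per pixel, set if the pixel is above the mean.

    Real implementations first shrink the image to a small fixed grid
    (for example 8x8); here the input is assumed to be that grid already.
    """
    flat = [v for row in pixels for v in row]
    mean = sum(flat) / len(flat)
    bits = 0
    for v in flat:
        bits = (bits << 1) | (1 if v > mean else 0)
    return bits


def hamming(a, b):
    """Number of differing bits between two hashes (small = visually similar)."""
    return bin(a ^ b).count("1")


# Hypothetical sample data: grayscale values in the range 0..255.
original = [[10, 200], [220, 30]]
brighter = [[v + 5 for v in row] for row in original]  # e.g. a slight re-encode
different = [[200, 10], [30, 220]]                      # a different picture

same_sha1 = sha1_hex(original) == sha1_hex(brighter)
near_distance = hamming(average_hash(original), average_hash(brighter))
far_distance = hamming(average_hash(original), average_hash(different))

print("SHA-1 equal for original and brighter copy:", same_sha1)
print("Perceptual distance, original vs brighter:", near_distance)
print("Perceptual distance, original vs different:", far_distance)
```

With a hash like this, near-duplicates could be found by looking for pairs whose Hamming distance falls under some threshold, whereas SHA-1 only ever catches exact copies.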