[Wikimedia-l] Data mining for media archives

6 Feb 2014


John Resig has just published some excellent data analysis combining
TinEye, image archives, and image clustering and deduplication to
identify identical and similar images across a large corpus.
http://ejohn.org/research/computer-vision-photo-archives/
Are we doing any analysis like this on Commons at the moment?
Is any similarity analysis done on upload to help uploaders identify
copies of the same image that already exist online?  Or to flag
potential copyvios for reviewers?
I'm sure TinEye would be glad to give us high-volume API access to
enable that sort of cross-referencing.
SJ
