On 09/21/2011 03:47 AM, Milos Rancic wrote:
- Create en.safe.wikipedia.org […]
Then governments, ISPs, and institutions could block the unsafe Wikipedia with DNS blocks, which is quite easy compared to deep packet inspection (DPI). Using en.wikipedia.org/safe/ instead might resolve this issue.
- Create safe.wikimedia.org. That would be the site for censoring/categorizing Commons images. It shouldn't be Commons itself, but its virtual fork. The fork would consist of hashes of the image names together with the images themselves. Thus, an image on Commons with the name "Torre_de_H%C3%A9rcules_-_DivesGallaecia2012-62.jpg" would be "fd37dae713526ee2da82f5a6cf6431de.jpg" on safe.wikimedia.org. The image preview located on upload.wikimedia.org under the name "thumb/8/80/Torre_de_H%C3%A9rcules_-_DivesGallaecia2012-62.jpg/800px-Torre_de_H%C3%A9rcules_-_DivesGallaecia2012-62.jpg" would be translated to "thumb/a1f3216e3344ea115bcac778937947f1.jpg" on safe.wikimedia.org. (Note: md5 is not likely to be the best hashing system; some other algorithm could be deployed.)
You're counting on there being too many possible hashes to go through, which is correct. But there are far fewer images to go through. You'd only have to create a list of the hashes of all 11 million or so images on Commons and compare that list to the list of unsafe images on safe.wikimedia.org. That is not easy (if the files themselves are used for hashing rather than only the file names, you would have to download all the files), but it is arguably doable.
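
To illustrate the comparison described above, here is a minimal Python sketch. It assumes the simplest variant, where the hash is MD5 over the file name only and the extension is kept; the helper names `hashed_name` and `unmask` and the sample file names are made up for this example and are not part of the proposal.

```python
import hashlib


def hashed_name(file_name: str) -> str:
    """Assumed scheme: MD5 hex digest of the UTF-8 file name, extension kept."""
    digest = hashlib.md5(file_name.encode("utf-8")).hexdigest()
    if "." in file_name:
        return digest + "." + file_name.rsplit(".", 1)[1]
    return digest


def unmask(commons_names, unsafe_hashes):
    """Map each hashed name on the unsafe list back to its Commons name."""
    # One pass over the known names builds the full reverse lookup table.
    lookup = {hashed_name(name): name for name in commons_names}
    return {h: lookup[h] for h in unsafe_hashes if h in lookup}


# Hypothetical data: a list of Commons names and a published unsafe list.
names = ["Example_A.jpg", "Example_B.png", "Example_C.jpg"]
unsafe = {hashed_name("Example_B.png")}
print(unmask(names, unsafe))
```

The cost is one hash per known image, so the work grows with the number of images on Commons, not with the number of possible hash values. If the file contents were hashed instead, the same lookup would apply, but every file would have to be downloaded first.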
So, in effect, I don't think your proposal properly achieves what it tries to accomplish. (Sorry if I misunderstood your proposal.)
-- Tobias