Date: Wed, 12 Oct 2011 15:36:09 +0300
From: Jussi-Ville Heiskanen <cimonavaro(a)gmail.com>
Subject: Re: [Foundation-l] Image filtering without undermining the category system
To: Wikimedia Foundation Mailing List <foundation-l(a)lists.wikimedia.org>
On Wed, Oct 12, 2011 at 2:24 PM, WereSpielChequers
<werespielchequers(a)gmail.com> wrote:
>> On Tue, Oct 11, 2011 at 11:55 PM, WereSpielChequers
>
> I really read that with a great deal of thought. I keep coming to the
> same conclusion here: the people who not only believe a workable
> system is desirable, but actively ignore the fact that what they are
> proposing is not real-world workable, seem to dominate the side in
> favor of some filtering scheme.
>
> Case in point: (from your proposal)
>
> "Whilst almost no-one objects to individuals making decisions as to
> what they want to see, as soon as one person decides what others on
> "their" network or IP can see you have crossed the line into enabling
> censorship. However as Wikimedia accounts are free, a logged in only
> solution would still be a free solution that was available to all."
>
> No, that is simply not logically sound. Period. Wikipedia has no
> control over what happens to content, or the formats or abilities of
> its scripts or whatever, as soon as it goes out of an intarweb pipe.
> Period. Not tenable, even if you believe a non-censorship-enabling
> implementation is a good thing (I don't, but I am trying to
> address the insanity of believing that it could ever be accomplished).
>
> The issue of whether external agencies could hack this has already come
> up on the talkpage:
> http://meta.wikimedia.org/wiki/User_talk:WereSpielChequers/filter
> The difficulty for anyone trying to do that is that they would be
> attempting to read millions of pages as a logged-in user without a bot
> flag, so they'd probably get blocked as a denial-of-service attack.
> Even if someone subdivided their calls and created multiple accounts to
> read parts of the project from hundreds of different PCs, they would
> only learn that someone had filtered in or out certain images. To
> replicate the filter they would need to have each of those accounts
> flag certain images as filtered or unfiltered - and at that point I
> would suggest that this has become a much more difficult thing to hack
> than simply extracting some of our existing categories.
>
> As you're the second person to raise this, I'll add an explanation to
> the proposal as to how this can be countered.
Do you actually have any idea what a Big Mama is, or how much brute
computing power one of those has?
--
Jussi-Ville Heiskanen, ~ [[User:Cimon Avaro]]
If I weren't somewhat aware of how Moore's law is working out in practice,
then I wouldn't have come up with a system that brute force alone would
struggle to effectively crack. For a botnet to determine which images were
on a filter would be non-trivial, especially if we put a throttle in the
system to counter DOS attacks. But discovering that someone had chosen to
filter an image, without knowing who had done so and whether they were
objecting to porn, spiders, military uniforms or factory-farmed meat, would
not be that useful to a censor. I'm confident that it would be orders of
magnitude more difficult to effectively crack this than it would be to
extract other data from our systems that could be used by a censor - such as
this list of not safe for work images
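
[The "throttle in the system to counter DOS attacks" mentioned above could take
the shape of an ordinary per-account rate limit. The sketch below is purely
illustrative - the class name, rates, and API are assumptions of this note, not
anything MediaWiki actually implements - but it shows why a crawler spread
across many throwaway accounts still reads pages far too slowly to map millions
of filter flags: each account gets its own token bucket.]

```python
import time
from collections import defaultdict

class ReadThrottle:
    """Illustrative token-bucket throttle (hypothetical, not MediaWiki code).

    Each account may read at most `rate` pages per second, with short
    bursts of up to `burst` pages. A scraper pulling filter flags through
    ordinary logged-in accounts is slowed to this rate per account.
    """

    def __init__(self, rate=2.0, burst=30):
        self.rate = rate    # page reads replenished per second
        self.burst = burst  # maximum bucket size (allowed burst)
        # account -> (tokens remaining, timestamp of last check)
        self.buckets = defaultdict(lambda: (burst, time.monotonic()))

    def allow(self, account):
        tokens, last = self.buckets[account]
        now = time.monotonic()
        # Replenish tokens for the elapsed time, capped at the burst size.
        tokens = min(self.burst, tokens + (now - last) * self.rate)
        if tokens < 1:
            self.buckets[account] = (tokens, now)
            return False  # over the limit: reject or delay this read
        self.buckets[account] = (tokens - 1, now)
        return True
```

At, say, 2 reads/second/account, mapping even one million filter flags would
take an account nearly six days of continuous reading - and the uniform,
relentless access pattern is exactly what gets flagged as a denial-of-service
attack.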