[Foundation-l] Image filtering without undermining the category system
werespielchequers at gmail.com
Wed Oct 12 13:50:21 UTC 2011
> Message: 7
> Date: Wed, 12 Oct 2011 15:36:09 +0300
> From: Jussi-Ville Heiskanen <cimonavaro at gmail.com>
> Subject: Re: [Foundation-l] Image filtering without undermining the
> category system
> To: Wikimedia Foundation Mailing List
> <foundation-l at lists.wikimedia.org>
> Message-ID: <CAJ9-EKLWLrvaCucF-X0SVeOEQpJVowKD=-HTLTRui4msA=atsg at mail.gmail.com>
> Content-Type: text/plain; charset=ISO-8859-1
> On Wed, Oct 12, 2011 at 2:24 PM, WereSpielChequers
> <werespielchequers at gmail.com> wrote:
> >> On Tue, Oct 11, 2011 at 11:55 PM, WereSpielChequers
> >> I really read that with a great deal of thought. I keep coming to the
> >> conclusion here that the people who not only believe a workable
> >> system is desirable, but actively ignore the fact that what they are
> >> proposing is not workable in the real world, seem to dominate the side in
> >> favor of some filtering scheme.
> >> Case in point: (from your proposal)
> >> "Whilst almost no-one objects to individuals making decisions as to
> >> what they want to see, as soon as one person decides what others on
> >> "their" network or IP can see you have crossed the line into enabling
> >> censorship. However as Wikimedia accounts are free, a logged in only
> >> solution would still be a free solution that was available to all."
> >> No, that is simply not logically sound. Period. Wikipedia has no
> >> control over what happens to content or the formats or abilities of
> >> their scripts or whatever, as soon as it goes out of an intarweb pipe.
> >> Period. Not tenable, even if you believe a non-censorship-enabling
> >> implementation is a good thing (I don't, but I am trying to
> >> address the insanity of believing that it could ever be accomplished.)
> > The issue of whether external agencies could hack this has already come up
> > on the talkpage.
> > http://meta.wikimedia.org/wiki/User_talk:WereSpielChequers/filter
> > The difficulty for anyone trying to do that is that they would need
> > to read millions of pages as a logged-in user without a bot flag, so they would
> > probably get blocked as a denial of service attack. Even if someone
> > subdivided their calls and created multiple accounts to read parts of the
> > project from hundreds of different PCs, they would only learn that someone
> > had filtered in or out certain images. To replicate the filter they would
> > need to have each of those accounts flag certain images as filtered or
> > unfiltered - and at that point I would suggest that this has become a much
> > more difficult thing to hack than simply extracting some of our existing
> > categories.
> > As you're the second person to raise this, I'll add an explanation to the
> > proposal as to how this can be countered.
> Do you actually have any idea what a Big Mama is, or how much brute
> computing power one of those has?
> Jussi-Ville Heiskanen, ~ [[User:Cimon Avaro]]
If I wasn't somewhat aware of how Moore's law is working out in practice
then I wouldn't have come up with a system that brute force alone would
struggle to effectively crack. For a botnet to determine which images were
on a filter would be non-trivial, especially if we put a throttle in the
system to counter DOS attacks. But discovering that someone had chosen to
filter an image without knowing who had done so and whether they were
objecting to porn, spiders, military uniforms or factory farmed meat would
not be that useful to a censor. I'm confident that it would be orders of
magnitude more difficult to effectively crack this than it would be to
extract other data from our systems that could be used by a censor - such as
this list of not safe for work images.
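The throttle mentioned above could be as simple as a per-account token bucket, which is what most sites use to slow scrapers. A minimal sketch (the rate and capacity numbers are purely illustrative, not anything from the proposal):

```python
import time

class TokenBucket:
    """Per-account throttle: allows `rate` requests per second,
    with short bursts up to `capacity` requests."""

    def __init__(self, rate, capacity):
        self.rate = rate          # tokens refilled per second
        self.capacity = capacity  # maximum burst size
        self.tokens = capacity
        self.last = time.monotonic()

    def allow(self):
        # Refill tokens in proportion to elapsed time, capped at capacity.
        now = time.monotonic()
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False

# One bucket per logged-in account: a bot trying to read millions of
# pages in a tight loop exhausts its burst allowance almost immediately
# and every further request is refused until tokens trickle back.
bucket = TokenBucket(rate=2, capacity=10)
results = [bucket.allow() for _ in range(20)]
```

In this sketch only the burst of roughly the first ten requests succeeds; a human reader never notices the limit, while a botnet has to spread its reads across so many accounts and so much time that the attack stops being cheap.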
Do you have any other objections to this proposal?