Message: 2
Date: Sun, 23 Oct 2011 16:36:37 +0200
From: Tobias Oelgarte <tobias.oelgarte@googlemail.com>
Subject: Re: [Foundation-l] category free image filtering
To: foundation-l@lists.wikimedia.org
Message-ID: <4EA42675.9070608@googlemail.com>
On 23.10.2011 15:46, WereSpielChequers wrote:
Message: 3
Date: Sun, 23 Oct 2011 02:57:51 +0200
From: Tobias Oelgarte <tobias.oelgarte@googlemail.com>
Subject: Re: [Foundation-l] category free image filtering
To: foundation-l@lists.wikimedia.org
Message-ID: <4EA3668F.5010004@googlemail.com>
On 23.10.2011 01:49, WereSpielChequers wrote:
Hi Tobias,
Do you have any problems with this category-free proposal? http://meta.wikimedia.org/wiki/User:WereSpielChequers/filter
WereSpielChequers
The idea isn't bad. But it is based on the premise that there are enough users of the filter to build such correlations. It needs enough input to work properly, and therefore enough users of the feature who have reasonably long lists. But how often does an average logged-in user come across such an image and act on it? Relatively seldom, resulting in very short personal lists held by relatively few users, which makes it hard to bootstrap the system (warm-up time).
Since I love to find ways to exploit systems, one simple trick comes to mind. Just log in, put a picture of a penis/bondage/... on the list, and then add another one of the football team you don't like. Repeat this often enough and the system will believe that all users who don't want to see a penis would also not want to see images of that football team.
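To make the exploit concrete, here is a minimal sketch, assuming the matching were based on how often images get hidden together across filter lists; the proposal does not specify its algorithm, and every account, image name and count below is invented for illustration.

    from collections import Counter
    from itertools import combinations

    # 20 genuine filter lists that hide two explicit images, and 20 sock lists
    # that pair one of those images with an unrelated football photo.
    genuine_lists = [{"Penis.jpg", "Bondage_demo.jpg"} for _ in range(20)]
    sock_lists = [{"Penis.jpg", "Example_FC_team.jpg"} for _ in range(20)]

    # Count how often each pair of images appears together on a filter list.
    co_hidden = Counter()
    for filter_list in genuine_lists + sock_lists:
        for pair in combinations(sorted(filter_list), 2):
            co_hidden[pair] += 1

    print(co_hidden.most_common())
    # [(('Bondage_demo.jpg', 'Penis.jpg'), 20), (('Example_FC_team.jpg', 'Penis.jpg'), 20)]
    # The fabricated pairing is now exactly as strong as the genuine one, so the
    # football photo would look like something the nudity-filterers want hidden.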
Another way would be: "I find everything offensive." This would hurt the system, since correlations would be much harder to find.
If we assume good faith, then it would probably work. But as soon as we have spammers of this kind, it will lie in ruins, considering the number of users and the correspondingly short lists (on average).
Just my thoughts on this idea.
Greetings nya~
Hi Tobias,
Yes, if it turned out that almost no-one used this, then only the "Hide all images - recommended for users with slow internet connections" and the "Never show me this image again" options would be effective. My suspicion is that even if globally there were only a few thousand users, it would start to be effective on the most contentious images in popular articles in the most widely read versions of Wikipedia (and I suspect that many of the same images will be used on other language versions). The more people using it, the more effective it would be, and the more varied phobias and cultural taboos it could cater for. We have hundreds of millions of readers; if we offer them a free image filter then I suspect that lots will sign up, but in a sense it doesn't matter how many do so - one of the advantages of this system is that when people complain about images they find offensive, we will simply be able to respond with instructions on how they can enable the image filter on their account.
I'm pretty confident that huge numbers, perhaps millions, with slow internet connections would use the "Hide all images" option, and that enabling them to do so would be an uncontentious way to further our mission by making our various products much more available in certain parts of the global south. As far as I'm concerned this is by far the most important part of the feature, and the one that I'm most confident will be used, though it may cease to be of use in the future when and if the rest of the world has North American internet speeds.
I'm not sure how spammers would try to use this, but I accept that vandals will try various techniques, from liking penises to finding pigs and particular politicians equally objectionable. Those who simply use this to "like" pictures of Mohammed would not be a problem; the system should easily be able to work out that things they liked would be disliked by another group of users. The much more clever approach of disliking both a particular type of porn and members of a particular football team is harder to cater for, but I'm hoping that it could be coded to recognise not just where preferences are completely unrelated, as in people with either arachnophobia or vertigo, but also where they are partially related, as in one person having both arachnophobia and vertigo. Those who find everything objectionable and tag thousands of images as such would easily be identified as having dissimilar preferences to others, as their preferences would be no more relevant to another filterer than those of an arachnophobe would be to a sufferer of vertigo.
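How the matching would tell "completely unrelated" from "partially related" is left open in the spec. As a minimal sketch, simple set-overlap measures over the filter lists could do it, assuming a list is just the set of images a filterer has hidden; the measures and the file names below are my own illustration, not part of the proposal.

    def jaccard(a, b):
        # Shared images relative to the union of both lists.
        a, b = set(a), set(b)
        return len(a & b) / len(a | b) if a and b else 0.0

    def overlap(a, b):
        # Shared images relative to the smaller of the two lists.
        a, b = set(a), set(b)
        return len(a & b) / min(len(a), len(b)) if a and b else 0.0

    spiders = {f"Spider_{i:03d}.jpg" for i in range(50)}   # arachnophobia only
    heights = {f"Cliff_{i:03d}.jpg" for i in range(50)}    # vertigo only
    both = spiders | heights                               # both phobias
    everything = both | {f"Img_{i:06d}.jpg" for i in range(100000)}  # tags it all

    print(overlap(both, spiders))        # 1.0  - partially related lists still match
    print(overlap(spiders, heights))     # 0.0  - unrelated phobias stay unrelated
    print(jaccard(everything, spiders))  # ~0.0005 - the tag-everything account is
                                         # dissimilar to every focused list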
Of course it's possible that there are people out there who are keen to tag images for others not to see. In this system there is room for them: if your preferences are similar to some such users then the system would pick that up. If your preferences are dissimilar, or you don't opt in to the filter, then they would have no effect on you. The system would work without such self-appointed censors, but why not make use of them? I used to live with an arachnophobe; if I were still doing so I'd have no problem creating an account and tagging a few hundred images of spiders so that they and other arachnophobes would easily be able to use the image filter, and the system would be able to identify those who had a similar preference to that account.
I was tempted to augment the design by giving filterers the option of having multiple filter lists with their own private user filter labels. This would complicate the user experience, but if a user had two lists, one that triggered their vertigo and the other their arachnophobia, it would then be much easier for the system to match them with others who shared either of their phobias. It would also be easier for the system to use either of their lists to assist the filters of others who shared that preference. However, it would also give anyone who hacked into the filter database a handy key to the meaning of the various preferences, and it would put us at the top of a slippery slope - if the data existed then sooner or later someone would suggest looking at the user filter labels and the images that came up in them. So I thought it safest to omit that feature.
Regards
WereSpielChequers
Tagging for others is the way to exploit this system or to make it ineffective. You wrote that you would tag images of spiders for him. This would only work if you used his account. If you used your own, then you would just mix his preferences with your own, giving the system a very different impression, while it learned nothing about him (his account).
Another issue is the warm-up time. While collecting the initial data needed for group building and relation searches could happen relatively fast (under the assumption that enough people would flag images), there would still be the initial task of training the system for yourself. That means that someone who has arachnophobia would have to look at an image of a spider-like creature and have the guts to stay until he hid it, and then calmly set his preferences. Ironically, he has to look directly at the spider to find that button, while other images of the same category might be presented in close proximity. A hard first task, even after the system has finished its warm-up period.
To start with "all hidden" as an option would give the system an nearly impossible task. While it is easy to determine what a user does not want to see, it is a completely different story to determine what he wants to see, under the premise that he does not want to be offended. That means that he will at least have to view something offending/ugly/disgusting/... at least once and to take an action the system could use to learn from.
One open problem is the so-called "logic/brain" of the system. Until we have an exact description of how it will work, we know neither its strong points nor its weak spots. Until I see an algorithm that is able to solve this task, I can't really say whether I'm in favor of or against the proposal.
nya~
Hi Tobias,
Yes if I were to use my own account to tag different sorts of things as objectionable then that would create a mixed list which the system would have difficulty matching usefully. But if someone were to create a sock account that they only used to tag images of spiders then it would be easy for a computer to match that with other filter lists that overlapped.
As for people needing to see images in order to decide they want to add them to their filter: no, that isn't their only option.
http://meta.wikimedia.org/wiki/User:WereSpielChequers/filter#Applying_filter...
When an image is hidden the filterer will only see the caption, the alt text and an unhide button.
The unhide button will say one of the following:
1. *Click to view image that you have previously chosen to hide*
2. *Click to view image that no fellow filterer has previously checked*
3. *Click to view image that other filterers have previously chosen to hide (we estimate from your filter choices there is an x% chance that you would find this image offensive)*
and also *No thanks, this is the sort of image I don't want to see*.
If the filterer clicks on the unhide button, the picture will be displayed along with a little radio button that allows the filterer to decide: *Don't show me that image again* or *That image is OK!*
Clicking *No thanks*, *Don't show me that image again* or *That image is OK!* will all result in updates to that Wikimedian's filter preferences. So it would be entirely possible to tune your preferences by judging images on the caption, alt text and context.
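The spec leaves open how the "x% chance" in option 3 above would be calculated. One simple possibility, sketched here with invented helper names and sample data, is a similarity-weighted vote among the other filterers whose lists overlap with yours.

    def jaccard(a, b):
        a, b = set(a), set(b)
        return len(a & b) / len(a | b) if a and b else 0.0

    def chance_unwanted(my_list, other_lists, image):
        # Estimate the probability that I would want `image` hidden, weighting
        # each other filterer by how similar their list is to mine.
        weighted_hide = total_weight = 0.0
        for other in other_lists:
            w = jaccard(my_list, other)
            if w == 0.0:
                continue                # completely unrelated filterers carry no weight
            total_weight += w
            if image in other:
                weighted_hide += w
        return weighted_hide / total_weight if total_weight else None  # None: no data yet

    me = {"Spider_001.jpg", "Spider_007.jpg"}
    others = [
        {"Spider_001.jpg", "Spider_002.jpg", "Tarantula.jpg"},
        {"Spider_007.jpg", "Tarantula.jpg"},
        {"Goya_nude.jpg"},              # dissimilar preferences, ignored
    ]
    print(chance_unwanted(me, others, "Tarantula.jpg"))  # 1.0, i.e. "x = 100"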
The "Hide all images", and "Hide all images that I've tagged to hide" options would not need any matching with other editors preferences. But the other options would, so if this went ahead this functionality would need to be launched with a warning that it was in Beta test and those options either wouldn't go live or wouldn't be of much use until other editors had used the system and populated some filters.
Does that sound workable to you, and more importantly would you have any objection to it?
As you and David Gerard have pointed out, this is a very high-level spec which doesn't specify how the filter lists would be matched to each other; thus far I've concentrated on the functionality and the user interface. I'm reluctant to add more detail on how one would code this because I'm somewhat rusty and outdated in my IT skills. I'm pretty confident that it is technically feasible, but it would be helpful to have one of the devs say how big a task this would be to code and how they would do it.
One of the advantages is that some of the matching need not be in real time. Obviously the system would need to be able to make a realtime decision as to whether an image was one you had previously hidden or that had been hidden by someone who it had previously identified as having similar filters to you. But the matching of different filter lists to find which were sufficiently similar need not be instantaneous, which should reduce the hardware requirement if this eventually were to hold millions of lists some of which would list thousands of images.
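To make that split concrete, here is a rough sketch assuming the same set-based filter lists as above; the function names and the similarity threshold are invented for illustration, not part of the spec.

    def jaccard(a, b):
        a, b = set(a), set(b)
        return len(a & b) / len(a | b) if a and b else 0.0

    # Offline, e.g. a periodic batch job: for every filterer, precompute the
    # images hidden by sufficiently similar fellow filterers.
    def build_suggested_hides(all_lists, min_similarity=0.2):
        suggestions = {}
        for user, my_list in all_lists.items():
            suggested = set()
            for other_user, other_list in all_lists.items():
                if other_user != user and jaccard(my_list, other_list) >= min_similarity:
                    suggested |= other_list - my_list
            suggestions[user] = suggested
        return suggestions

    # Real time, at page render: two set lookups, no matching work at all.
    def should_hide(image, my_list, my_suggested_hides):
        return image in my_list or image in my_suggested_hides

The quadratic offline pass is only meant to show where the cost would sit; with millions of lists, some holding thousands of images, it would need something smarter, which is exactly the sizing question for the devs.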
WereSpielChequers
And after this procedure, we all expect that some readers may become editors?
Good Luck!
I hope and expect that Wikipedia can help people become more educated. The more educated people are, the less important these filters will be.
That should be our goal - not patronizing readers in advance.
h.