This sounds like an interesting potential approach to deal with spambots, and hopefully to deter the people who make them.
https://techcrunch.com/2019/02/05/kasada-bots/
I don't know how practical it would be to implement an approach like this in the Wikiverse, and whether licensing proprietary technology would be required.
I would be interested in decreasing the quantity and effectiveness of spambots that misuse WMF infrastructure, damage the quality of Wikimedia content, and drain significant cumulative time from the limited supply of good faith contributors.
On Fri, Feb 8, 2019 at 6:20 PM Pine W wiki.pine@gmail.com wrote:
I don't know how practical it would be to implement an approach like this in the Wikiverse, and whether licensing proprietary technology would be required.
They are talking about Polyform [1], a reverse proxy that filters traffic with a combination of browser fingerprinting, behavior analysis, and proof of work. Proof of work is not really useful unless you have huge levels of bot traffic from a single bot operator (also it means locking out users with no JavaScript); browser and behavior analysis very likely cannot be outsourced to a third party for privacy reasons. Maybe we could do it ourselves (although it would still bring up interesting questions privacy-wise), but it would be a huge undertaking.
[1] https://www.kasada.io/product/
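For readers unfamiliar with the proof-of-work part, here is a minimal, hypothetical sketch (a hash-based challenge, not Polyform's actual scheme). It illustrates why the cost lands on the client for every request, so it mainly deters a single operator sending traffic at scale and does nothing for clients that cannot run JavaScript.

```python
import hashlib
import itertools
import os

# Minimal proof-of-work sketch (illustrative only, not Polyform's actual scheme).
# The server issues a random challenge; the client must find a nonce such that
# SHA-256(challenge + nonce) starts with `difficulty` zero hex digits.

def make_challenge() -> str:
    return os.urandom(16).hex()

def solve(challenge: str, difficulty: int = 4) -> int:
    """Client side: brute-force a nonce. Expected cost ~16**difficulty hashes."""
    target = "0" * difficulty
    for nonce in itertools.count():
        digest = hashlib.sha256(f"{challenge}{nonce}".encode()).hexdigest()
        if digest.startswith(target):
            return nonce

def verify(challenge: str, nonce: int, difficulty: int = 4) -> bool:
    """Server side: a single hash, so verification is cheap."""
    digest = hashlib.sha256(f"{challenge}{nonce}".encode()).hexdigest()
    return digest.startswith("0" * difficulty)

challenge = make_challenge()
nonce = solve(challenge)
assert verify(challenge, nonce)
```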
OK. Yesterday a few other ENWP people and I were looking at what I think was a series of edits by either a vandal bot or an inadequately designed and unapproved good-faith bot. I read that it made approximately 500 edits before someone who knew enough about ENWP saw what was happening and did something about it. I don't know how many problematic bots we have, in addition to vandal bots, but I am confident that they drain a nontrivial amount of time from stewards, admins, and patrollers.
I don't know how much of a priority WMF places on detecting and stopping unwelcome bots, but I think that the question of how to decrease the numbers and effectiveness of unwelcome bots would be a good topic for WMF to research.
Pine ( https://meta.wikimedia.org/wiki/User:Pine )
To clarify the types of unwelcome bots that we have, here are the ones that I think are most common:
1) Spambots
2) Vandalbots
3) Unauthorized bots which may be intended to act in good faith but which may cause problems that could probably have been identified during standard testing in Wikimedia communities which have a relatively well developed bot approval process. (See https://en.wikipedia.org/wiki/Wikipedia:Bots/Requests_for_approval.)
Maybe unwelcome bots are not a priority for WMF at the moment, in which case I could add this subject to a backlog. I am sorry if I sound grumpy at WMF regarding this subject; this is a problem, but I know that there are millions of problems and I don't expect a different project to be dropped in order to address this one.
While it is a rough analogy, I think that this movie clip helps to illustrate a problem of bad bots. Although the clip is amusing, I am not amused by unwelcome bots causing problems on ENWP or anywhere else in the Wikiverse. https://www.youtube.com/watch?v=lokKpSrNqDA
Thanks,
Pine ( https://meta.wikimedia.org/wiki/User:Pine )
Sure, it's certainly a front we can do better on.
I don't think Kasada is a product that's appropriate at this time. Ignoring the ideological aspect of it being non-free software, there are a lot of easy things we could and should try first.
However, I'd caution against viewing this as purely a technical problem. Wikimedia is not like other websites - we have allowable bots. For many commercial websites, the only good bot is a dead bot. Wikimedia has many good bots. On enwiki they usually have to be approved; I don't think that's true on all wikis. We also consider it perfectly OK to do limited testing of bots before they are approved. We also encourage the creation of alternative "clients", which from a server perspective look like bots. Unlike other websites, where anything non-human is evil, here we need to ensure our blocking corresponds to the social norms of the community. This may not sound that hard, but I think it complicates bot blocking more than is obvious at first glance.
Second, this sort of thing is something that tends to fall through the cracks at WMF. AFAIK the last time there was a team responsible for admin tools & anti-abuse was 2013 ( https://www.mediawiki.org/wiki/Admin_tools_development ). I believe (correct me if I'm wrong) that the Anti-Harassment team is all about human harassment and not anti-abuse in this sense. Security is adjacent to this problem, but traditionally has not considered this problem in scope. Even core tools like CheckUser have been largely ignored by the foundation for many, many years.
I guess this is a long-winded way of saying: I think there should be a team responsible for this sort of stuff at WMF, but there isn't one. I think there are a lot of rather easy things we can try (off the top of my head: better captchas, more adaptive rate limits that adjust based on how evilish you look, etc.), but they definitely require close involvement with the community to ensure that we do the actual right thing.
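As a rough illustration of what "more adaptive rate limits" could mean, here is a hypothetical sketch: the per-minute allowance shrinks as a suspicion score grows. The signals and thresholds are invented for illustration; MediaWiki's real rate limiting ($wgRateLimits) is static per user group and works differently.

```python
import time
from collections import defaultdict, deque

def suspicion_score(client: dict) -> float:
    """Placeholder: combine whatever signals are available (0.0 = trusted)."""
    score = 0.0
    if not client.get("has_js"):
        score += 0.3
    if client.get("account_age_days", 0) < 1:
        score += 0.4
    if client.get("captcha_failures", 0) > 2:
        score += 0.3
    return min(score, 1.0)

class AdaptiveRateLimiter:
    """Sliding one-minute window; the allowed request count shrinks with suspicion."""

    def __init__(self, base_per_minute: int = 30):
        self.base = base_per_minute
        self.history = defaultdict(deque)  # client id -> recent request timestamps

    def allow(self, client_id: str, client: dict) -> bool:
        limit = max(1, int(self.base * (1.0 - suspicion_score(client))))
        now = time.time()
        window = self.history[client_id]
        while window and now - window[0] > 60:
            window.popleft()
        if len(window) >= limit:
            return False
        window.append(now)
        return True
```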
-- Brian (p.s. Consider this a volunteer hat email)
This may be naive, but... isn't the wishlist filling this need? And if not through a consensus-driven method like the wishlist, how should a WMF team prioritize which power-user tools it needs to focus on?
Or is it just a matter of "Yes, wishlist, but more of it"?
- Jonathan
We've been working on unflagged bot detection on my team. It's far from a real product integration, but we have shown that it works in practice. We tested this in Wikidata, but I don't see a good reason why a similar strategy wouldn't work for English Wikipedia.
Hall, A., Terveen, L., & Halfaker, A. (2018). Bot Detection in Wikidata Using Behavioral and Other Informal Cues. Proceedings of the ACM on Human-Computer Interaction, 2(CSCW), 64. PDF: https://dl.acm.org/ft_gateway.cfm?id=3274333&type=pdf
In theory, we could get this into ORES if there was strong demand. As Pine points out, we'd need to delay some other projects. For reference, the next thing on the backlog that I'm looking at is setting up article quality prediction for Swedish Wikipedia.
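For context, existing ORES models are consumed over a simple HTTP API, and a bot-detection model, if one were ever deployed, would presumably be consumed the same way. The sketch below queries the real "damaging" model; an "unflaggedbot" model name is purely hypothetical here.

```python
import requests

# Sketch of consuming scores from ORES (https://ores.wikimedia.org).
# "damaging" is an existing model; a bot-detection model like the one in the
# paper would only be queryable if it were actually deployed under some name
# (e.g. the hypothetical "unflaggedbot").

ORES = "https://ores.wikimedia.org/v3/scores"

def score_revision(context: str, rev_id: int, model: str = "damaging") -> dict:
    resp = requests.get(
        f"{ORES}/{context}/{rev_id}",
        params={"models": model},
        headers={"User-Agent": "example-script/0.1 (illustration only)"},
        timeout=10,
    )
    resp.raise_for_status()
    # Response shape: {context: {"scores": {rev_id: {model: {...}}}}}
    return resp.json()[context]["scores"][str(rev_id)][model]

# Example: probability that an English Wikipedia revision is damaging.
# print(score_revision("enwiki", 123456789))
```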
-Aaron
It is extremely easy to detect a bot unless the bot operator has chosen to make it hard. Just make a model of how the user interacts with the input devices, and do anomaly detection. That implies the use of JavaScript, though, but users not using JS are either very dubious or quite well known. There are nearly no new users who do not use JS.
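A toy sketch of that idea, with invented features and thresholds: client-side JavaScript would record input-event timestamps (keystrokes, mouse moves), and the server would flag sessions whose timing looks too regular to be human. A real system would use a proper anomaly-detection model rather than a fixed cutoff.

```python
import statistics

def looks_automated(event_times_ms: list,
                    min_events: int = 20,
                    min_stdev_ms: float = 15.0) -> bool:
    """Flag a session whose inter-event intervals are suspiciously uniform.

    Thresholds are made up for illustration; humans produce much noisier
    timing than a script firing an event every 100 ms exactly.
    """
    if len(event_times_ms) < min_events:
        return False  # not enough signal; don't flag
    intervals = [b - a for a, b in zip(event_times_ms, event_times_ms[1:])]
    return statistics.stdev(intervals) < min_stdev_ms

# A perfectly regular stream gets flagged; jittery human-like input does not.
assert looks_automated([i * 100.0 for i in range(30)])
assert not looks_automated([0, 130, 310, 380, 620, 700, 990, 1200, 1270, 1500,
                            1800, 1870, 2100, 2400, 2430, 2700, 3100, 3150,
                            3500, 3900, 4000])
```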
Reused a previous tex-file, and did not clean it up? "Magnetic Normal Modes of Bi-Component Permalloy Structures" ;)
Stewards are just 34 people, and that is not enough to be a big voting power on the wishlist the way enwiki people are. What we actually need cannot get through that way.
-- Yongmin Sent from my iPhone
Text licensed under CC BY ND 2.0 KR Please note that this address is list-only address and any non-mailing list mails will be treated as spam. Please use https://encrypt.to/0x947f156f16250de39788c3c35b625da5beff197a
Thanks for the replies.
I think that detailed discussion of the pros and cons of the Tech Wishlist should be separate from this thread, but I agree that one way to get a subject like unflagged bot detection addressed could be through the Tech Wishlist assuming that WMF is willing to devote resources to that topic if it ranked in the top X places.
It sounds like there are a few different ways that work in this area could be resourced:
1. As mentioned above, making it a tech wishlist item and having Community Tech work on it;
2. Having the Anti-Harassment Tools team work on it;
3. Having the Security team work on it;
4. Having the ORES team work on it;
5. Funding work through a WMF grants program;
6. Funding through a mentorship program like GSOC. I believe that GSOC previously supported work on CAPTCHA improvements.
Of the above options I suggest first considering 2 and 4. Having AHAT staff work on unflagged bot detection might be scope creep under the existing AHAT charter but perhaps AHAT's charter could be modified into something that would resemble the charter for an "Administrators' Tools Team". And if the ORES team has already done some work on unflagged bot detection then perhaps ORES and AHAT staff could collaborate on this topic.
In the first half of the next WMF fiscal year, I think that planning for an existing WMF team, or a combination of staff from existing teams, to work on unflagged bot detection would be good. If WMF does not resource this topic, then if community people want unflagged bot detection to be resourced, we can consider other options such as 1 and 5.
Pine ( https://meta.wikimedia.org/wiki/User:Pine )
Hi David, do you have a question? I saw the GIF but I don't know how to interpret it in the context of this conversation.
Thanks,
Pine ( https://meta.wikimedia.org/wiki/User:Pine )
On Tue, Feb 12, 2019 at 5:49 AM David Barratt dbarratt@wikimedia.org wrote:
Couple thoughts:
1. ORES platform (ores.wikimedia.org) was designed to host a wide range of machine learning models, not just the ones built by Aaron Halfaker himself. So, if there is a computer scientist out there who is interested in training and maintaining a new bot-detection model, it can be hosted on and surfaced through ORES. Then anyone with some bot- or web-development skills can build tools on top of that model. Noting this because that's one of the main points of having a "scoring platform": it separates the (necessarily WMF-led) work of production platform development from the development of purpose-built tools.
2. If anyone knows a computer scientist who is interested in developing and piloting a model like this, please send them our way. Members of the Research team, or Aaron, *may* have capacity to support a formal collaboration.
3. This seems way too complex for a GSOC project to me, but I'd love to be wrong about that. If there are students who are interested in working on this, please send them our way (no promises, obvs).
4. Modifying the charter of an existing WMF product team seems somewhat out of scope for this ask, task, and venue. :)
- J
The tech wishlist is awesome, and they do a lot of great work.
However, I don't think this type of democratically driven development is appropriate for all things. If it were, we would just get rid of all the other dev teams and just have a wishlist. In this case what is needed is an anti-abuse strategy, not just a one-off feature. This involves development of many features over the long term, maintenance, long-term product management, integration into the whole, etc. Even in real life, nobody ever votes for maintenance until it's way too late and everything is about to explode. Not to mention the product research aspect of it - the wishlist inherently encourages people to think inside the box, as it is basically asking the question of what's wrong with the current box. You can't vote for something if you don't realize it's a choice.
As others have mentioned, majority rule is also sometimes not the appropriate way to choose what to do. Sometimes there are things that only affect a minority, but it's an important minority. Sometimes there are things that affect everyone slightly, and they win over things that affect a small class significantly (of course both types of things are important). Sometimes there are things that are long-term important but short-term unimportant. [Not saying that people can't vote rationally for long-term tasks, just that the wishlist is mostly developed around the idea of short-term tasks, short enough that you can do about 10 of them in a year.]
-- Brian
Since we're discussing how the Tech Wishlist works, I will comment on a few points specifically regarding that wishlist.
1. A gentle correction: the recommendations are ranked by vote, not by consensus. This has pros and cons.
2a. If memory serves me correctly, the wishlist process was designed by WMF rather than designed by community consensus. I may be wrong about this, but in my search of historical records I have not found evidence to the contrary. I think that redesigning the process would be worth considering, and I hope that a redesign would help to account for the types of needs that bawolff described in his second paragraph.
2b. I think that it's an overstatement to say that "nobody ever votes for maintenance until it's way too late and everything is about to explode". I think that many non-WMF people are aware of our backlogs, the endless requests for help and conflict resolution, and the many challenges of maintaining what we have with the current population of skilled and good-faith non-WMF people. However, I have the impression that there is a common *tendency* among humans in general to chase shiny new features instead of doing mostly thankless work, and I agree that the tech wishlist is unlikely, even in a redesigned form, to be well suited for long-term planning. I think that WMF's strategy process may be a better way to plan for the long term, including for maintenance activities that are mostly thankless and do not necessarily correlate with increasing someone's personal power, making their resume look better, or having fun. Fortunately, the volunteer mentality of many non-WMF people means that we do have people who are willing to do mostly thankless, mundane, and/or stressful work, and I think that some of us feel that our work is important for maintaining the encyclopedia even when we do not enjoy it, but we have a finite supply of time from such people.
Pine ( https://meta.wikimedia.org/wiki/User:Pine )
I actually meant a different type of maintenance.
Maintaining the encyclopedia (and other wiki projects) is of course an activity that needs software support.
But software is also something that needs maintenance. Technology, standards, and circumstances change over time. Software left alone will "bitrot" over time. A long-term technical strategy to do anything needs to account for that and plan for that. One-off feature development does not. Democratically directed one-off feature development accounts for it even less.
In response to Jonathan: so let's say that ORES/magic AI detects that something is a bot. Then what? That's a small part of the picture. In fact, you don't even need AI to do this; plenty of the vandal bots have generic programming-language user-agents (AI could of course be useful for the long tail here, but there's much simpler stuff to start off with). Do we expose this to AbuseFilter somehow? Do we add a tag to mark it in RC/watchlists? Do we block it? Do we rate limit it? What amount of false positives is acceptable? What is the UI for all this? To what extent is this hard-coded, and to what extent do communities control the feature? Etc.
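As a trivial illustration of that "much simpler stuff": many unwanted bots ship the default User-Agent string of their HTTP library. The pattern list below is a made-up example, and deciding what to actually do with a match (tag, throttle, expose to AbuseFilter) is the hard, community-facing part.

```python
import re

# Hypothetical heuristic: flag requests whose User-Agent is a bare HTTP-library
# default, unless the account is already an approved (flagged) bot.
DEFAULT_LIBRARY_UA = re.compile(
    r"^(python-requests|python-urllib|curl|wget|java|go-http-client|okhttp|libwww-perl)",
    re.IGNORECASE,
)

def looks_like_unflagged_bot(user_agent: str, is_flagged_bot_account: bool) -> bool:
    if is_flagged_bot_account:
        return False  # approved bots are welcome
    return bool(DEFAULT_LIBRARY_UA.match(user_agent.strip()))

# looks_like_unflagged_bot("python-requests/2.28.1", False)            -> True
# looks_like_unflagged_bot("Mozilla/5.0 (X11; Linux x86_64) ...", False) -> False
```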
We don't need products to detect bots. Making products to detect bots is easy. We need product managers to come up with socio-technical systems that make sense in our special context.
-- Brian
Brian,
I think we may be talking past each other. I'm Mr. Socio-technical systems. I thought what was being requested was a way to detect bots.
I maintain my own bots, work extensively with product teams, and have a deep and abiding familiarity with the complexity of designing effective tools for Wikipedia.
- J
On Wed, Feb 13, 2019 at 12:13 PM bawolff bawolff+wn@gmail.com wrote:
I actually meant a different type of maintenance.
Maintaining the encyclopedia (and other wiki projects) is of course an activity that needs software support.
But software is also something that needs maintenance. Technology, standards, circumstances change over time. Software left alone will "bitrot" over time. A long term technical strategy to do anything needs to account for that, plan for that. One off feature development does not. Democratically directed one-off feature development accounts for that even less.
I understand. I was intending to comment on maintenance activities in general, whether that be maintenance of a city's water system, maintenance of the text of encyclopedia articles, or maintenance of software. My train of thought proceeded into a somewhat detailed commentary regarding maintenance of non-software Wikimedia elements. I think that the tendency to under-resource maintenance in favor of novelties is similar in many domains of human activity, but I also think that humans collectively are not so unwise that we will prefer novelties over maintenance every time that there is a referendum on whether to maintain an existing service or to create something new. {{Citation needed}}
I think that multiple good points have been raised in this thread regarding the subjects of technical and human systems for detecting and intervening against possible unflagged bots. I am wondering what a good way would be to get a WMF product manager or someone similar to dedicate time to this topic. My preference remains that one or more WMF people, or teams, add this to their list of topics to address in a future quarter such as Q1 of the WMF 2019-2020 fiscal year. I don't know how the WMF Community Tech team plans for maintenance of features after the features are initially built, debugged, and deployed, and based on the current state of this discussion I don't currently have a strong opinion regarding whether Community Tech or a different team would be best suited to work on the topic of unflagged bots. I also don't know how WMF makes decisions about what goals are for teams other than Community Tech for future quarters, but that information could be helpful to have for this conversation.
Thanks,