It is extremely easy to detect a bot unless the bot operator chose to make
it hard. Just build a model of how the user interacts with the input
devices, and do anomaly detection. That implies the use of JavaScript, but
users not using JS are either very dubious or quite well known. There are
nearly no new users who do not use JS.
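
To make the idea concrete, here is a minimal sketch of that kind of anomaly
detection, assuming client-side JS has already reported a list of input-event
timestamps (in milliseconds) per session. Everything in it is hypothetical:
the function names, the two features (mean gap between events and its
coefficient of variation), and the 3.5 threshold are illustrative choices,
not anything Wikimedia or ORES actually runs. It scores a session against a
baseline of known-human sessions using a robust z-score (median and MAD).

```python
from statistics import mean, median, pstdev


def timing_features(timestamps_ms):
    """Return (mean gap, coefficient of variation) for one session's events."""
    gaps = [b - a for a, b in zip(timestamps_ms, timestamps_ms[1:])]
    if len(gaps) < 2:
        raise ValueError("need at least three events")
    mu = mean(gaps)
    # Scripted input tends to be unnaturally regular, giving a CV near zero.
    cv = pstdev(gaps) / mu if mu > 0 else 0.0
    return mu, cv


def robust_z(value, baseline):
    """Distance of value from the baseline median, in scaled-MAD units."""
    med = median(baseline)
    mad = median(abs(x - med) for x in baseline)
    if mad == 0:
        return 0.0 if value == med else float("inf")
    # 1.4826 makes the MAD comparable to a standard deviation for normal data.
    return (value - med) / (1.4826 * mad)


def is_anomalous(timestamps_ms, baseline_sessions, threshold=3.5):
    """Flag a session whose timing features sit far from the human baseline.

    The threshold is an assumed value and would need tuning on real data.
    """
    mu, cv = timing_features(timestamps_ms)
    base = [timing_features(s) for s in baseline_sessions]
    z_mu = robust_z(mu, [b[0] for b in base])
    z_cv = robust_z(cv, [b[1] for b in base])
    return max(abs(z_mu), abs(z_cv)) > threshold
```

A session firing an event exactly every 10 ms would be flagged against a
baseline of irregular human timings, while a session with human-like jitter
would pass. A real model would need far more features than this, and the
privacy questions raised further down the thread still apply.
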
Reused a previous tex-file, and did not clean it up? "Magnetic Normal Modes
of Bi-Component Permalloy Structures" ;)
On Mon, Feb 11, 2019 at 6:47 PM Aaron Halfaker <ahalfaker(a)wikimedia.org>
wrote:
We've been working on unflagged bot detection on my team. It's far from a
real product integration, but we have shown that it works in practice. We
tested this in Wikidata, but I don't see a good reason why a similar
strategy wouldn't work for English Wikipedia.
Hall, A., Terveen, L., & Halfaker, A. (2018). Bot Detection in Wikidata
Using Behavioral and Other Informal Cues.
*Proceedings of the ACM on Human-Computer Interaction*, *2*(CSCW), 64.
pdf
<https://dl.acm.org/ft_gateway.cfm?id=3274333&type=pdf>
In theory, we could get this into ORES if there was strong demand. As Pine
points out, we'd need to delay some other projects. For reference, the
next thing on the backlog that I'm looking at is setting article quality
prediction for Swedish Wikipedia.
-Aaron
On Mon, Feb 11, 2019 at 11:19 AM Jonathan Morgan <jmorgan(a)wikimedia.org>
wrote:
> This may be naive, but... isn't the wishlist filling this need? And if not
> through a consensus-driven method like the wishlist, how should a WMF team
> prioritize which power user tools it needs to focus on?
>
> Or is it just a matter of "Yes, wishlist, but more of it"?
>
> - Jonathan
>
> On Mon, Feb 11, 2019 at 2:34 AM bawolff <bawolff+wn(a)gmail.com> wrote:
>
> > Sure, it's certainly a front we can do better on.
> >
> > I don't think Kasada is a product that's appropriate at this time. Ignoring
> > the ideological aspect of it being non-free software, there's a lot of easy
> > things we could and should try first.
> >
> > However, I'd caution against viewing this as purely a technical problem.
> > Wikimedia is not like other websites - we have allowable bots. For many
> > commercial websites, the only good bot is a dead bot. Wikimedia has many
> > good bots. On enwiki they usually have to be approved; I don't think that's
> > true on all wikis. We also consider it perfectly ok to do limited testing
> > of a bot before it is approved. We also encourage the creation of
> > alternative "clients", which from a server perspective look like bots.
> > Unlike other websites where anything non-human is evil, here we need to
> > ensure our blocking corresponds to the social norms of the community. This
> > may not sound that hard, but I think it complicates bot blocking more than
> > is obvious at first glance.
> >
> > Second, this sort of thing is something that tends to fall through the
> > cracks at WMF. AFAIK the last time there was a team responsible for admin
> > tools & anti-abuse was 2013
> > (https://www.mediawiki.org/wiki/Admin_tools_development). I believe
> > (correct me if I'm wrong) that the anti-harassment team is all about human
> > harassment and not anti-abuse in this sense. Security is adjacent to this
> > problem, but traditionally has not considered this problem in scope. Even
> > core tools like checkuser have been largely ignored by the foundation for
> > many, many years.
> >
> > I guess this is a long winded way of saying - I think there should be a
> > team responsible for this sort of stuff at WMF, but there isn't one. I
> > think there's a lot of rather easy things we can try (Off the top of my
> > head: Better captchas. More adaptive rate limits that adjust based on how
> > evilish you look, etc), but they definitely require close involvement with
> > the community to ensure that we do the actual right thing.
> >
> > --
> > Brian
> > (p.s. Consider this a volunteer hat email)
> >
> > On Sun, Feb 10, 2019 at 6:06 AM Pine W <wiki.pine(a)gmail.com> wrote:
> >
> > > To clarify the types of unwelcome bots that we have, here are the ones
> > > that I think are most common:
> > >
> > > 1) Spambots
> > >
> > > 2) Vandalbots
> > >
> > > 3) Unauthorized bots which may be intended to act in good faith but which
> > > may cause problems that could probably have been identified during
> > > standard testing in Wikimedia communities which have a relatively well
> > > developed bot approval process. (See
> > > https://en.wikipedia.org/wiki/Wikipedia:Bots/Requests_for_approval.)
> > >
> > > Maybe unwelcome bots are not a priority for WMF at the moment, in which
> > > case I could add this subject into a backlog. I am sorry if I sound grumpy
> > > at WMF regarding this subject; this is a problem but I know that there are
> > > millions of problems and I don't expect a different project to be dropped
> > > in order to address this one.
> > >
> > > While it is a rough analogy, I think that this movie clip helps to
> > > illustrate a problem of bad bots. Although the clip is amusing, I am not
> > > amused by unwelcome bots causing problems on ENWP or anywhere else in the
> > > Wikiverse. https://www.youtube.com/watch?v=lokKpSrNqDA
> > >
> > > Thanks,
> > >
> > > Pine
> > > ( https://meta.wikimedia.org/wiki/User:Pine )
> > >
> > >
> > >
> > > On Sat, Feb 9, 2019, 1:40 PM Pine W <wiki.pine(a)gmail.com> wrote:
> > >
> > > > OK. Yesterday I was looking with a few other ENWP people at what I think
> > > > was a series of edits by either a vandal bot or an inadequately designed
> > > > and unapproved good faith bot. I read that it made approximately 500
> > > > edits before someone who knew enough about ENWP saw what was happening
> > > > and did something about it. I don't know how many problematic bots we
> > > > have, in addition to vandal bots, but I am confident that they drain a
> > > > nontrivial amount of time from stewards, admins, and patrollers.
> > > >
> > > > I don't know how much of a priority WMF places on detecting and stopping
> > > > unwelcome bots, but I think that the question of how to decrease the
> > > > numbers and effectiveness of unwelcome bots would be a good topic for
> > > > WMF to research.
> > > >
> > > > Pine
> > > > ( https://meta.wikimedia.org/wiki/User:Pine )
> > > >
> > > >
> > > > On Sat, Feb 9, 2019 at 9:24 PM Gergo Tisza <gtisza(a)wikimedia.org> wrote:
> > > >
> > > > >> On Fri, Feb 8, 2019 at 6:20 PM Pine W <wiki.pine(a)gmail.com> wrote:
> > > > >>
> > > > >> > I don't know how practical it would be to implement an approach
> > > > >> > like this in the Wikiverse, and whether licensing proprietary
> > > > >> > technology would be required.
> > > >> >
> > > >>
> > > > >> They are talking about Polyform [1], a reverse proxy that filters
> > > > >> traffic with a combination of browser fingerprinting, behavior
> > > > >> analysis and proof of work.
> > > > >> Proof of work is not really useful unless you have huge levels of bot
> > > > >> traffic from a single bot operator (also it means locking out users
> > > > >> with no Javascript); browser and behavior analysis very likely cannot
> > > > >> be outsourced to a third party for privacy reasons. Maybe we could do
> > > > >> it ourselves (although it would still bring up interesting questions
> > > > >> privacy-wise) but it would be a huge undertaking.
> > > > >>
> > > > >>
> > > > >> [1] https://www.kasada.io/product/
> > > > >> _______________________________________________
> > > > >> Wikitech-l mailing list
> > > > >> Wikitech-l(a)lists.wikimedia.org
> > > > >> https://lists.wikimedia.org/mailman/listinfo/wikitech-l
> > > > >>
>
--
Jonathan T. Morgan
Senior Design Researcher
Wikimedia Foundation
User:Jmorgan (WMF) <https://meta.wikimedia.org/wiki/User:Jmorgan_(WMF)>
--
Aaron Halfaker
Principal Research Scientist
Head of the Scoring Platform team
Wikimedia Foundation