Hello fellow wiki investigators!
I have observed that, very often in wikis, users not in the bot groups are actually behaving like bots. Since the MediaWiki API doesn't restrict normal users from automating tasks, you might have a "normal" user actually doing bot things. I would like to identify those users and treat them as bots.
Is anyone aware of an existing, implemented model that classifies whether a user is a bot or not?
Thanks, and have a nice weekend!
Hi Abel,
I think you need a third category: cyborg ;-)
More seriously, there is research on identifying contributor types. See our review in http://wikiworkshop.org/2017/papers/p1627-dahm.pdf, section 2.2.3, on this topic. For example, Ron Meier is an account that makes excessive use of scripts. However, upon manual investigation, we got the impression that this was an actual human.
However, I have not looked at the literature on bot and vandalism detection recently. That's probably a good starting point.
All the best, physikerwelt
On Sat, Jan 19, 2019 at 11:25 AM ABEL SERRANO JUSTE abeserra@ucm.es wrote:
-- Saludos, Abel. _______________________________________________ Wiki-research-l mailing list Wiki-research-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wiki-research-l
Aside from the sensitivities of this (and yes, in case there was any doubt, calling an editor a bot is not something one should do lightly), it isn't an easy thing to either define or identify. Making bot edits from a non-bot account is a big deal on Wikipedia; I have seen an admin desysopped and then blocked for this. Please be aware that labelling good-faith non-bot editors as bots is unethical and liable to cause another clash between the community and researchers.
Edits per minute might at first glance look like a safe way to go, but then you realise that some people will spend a long time manually building up to a situation where they click a button and that completes dozens of edits almost simultaneously.
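To make the edits-per-minute idea concrete, here is a minimal sketch that counts the largest number of edits falling inside any 60-second window. The function name `peak_edits_per_window` and the sample timestamps are hypothetical, not taken from any existing tool. The example history deliberately mimics the case described above: a long manual build-up followed by a batch saved almost simultaneously, which scores high without proving automation.

```python
from datetime import datetime, timedelta


def peak_edits_per_window(timestamps, window=timedelta(seconds=60)):
    """Return the largest number of edits falling inside any single window."""
    times = sorted(timestamps)
    best = 0
    start = 0
    for end, current in enumerate(times):
        # Move the left edge forward until the span fits inside the window.
        while current - times[start] > window:
            start += 1
        best = max(best, end - start + 1)
    return best


# Hypothetical history: two isolated edits, then 12 edits saved within seconds.
base = datetime(2019, 1, 19, 10, 0, 0)
edits = [base, base + timedelta(minutes=45)]
edits += [base + timedelta(minutes=90, seconds=s) for s in range(12)]

print(peak_edits_per_window(edits))  # 12
```

A high peak is therefore only a hint to look closer, not evidence that the account is a bot.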
Type of edit and similarity of a series of edits might look like a good way to go, but what you will have difficulty identifying is that the person who seems to be making a series of edits without individual consideration may be working their way through a list of possible edits and clicking save or skip on each of them as a manual decision. Judging the results from the edits saved without knowing what led up to saving those edits won't tell you if an edit was a bot edit.
What you can do is look for dormant accounts that are no longer flagged as bots. On the English-language Wikipedia we have a list of them at https://en.wikipedia.org/wiki/Wikipedia:List_of_Wikipedians_by_number_of_edi... Other language versions may have similar lists and are likely to have the same process of removing bot flags from bot accounts that retire.
Regards
Jonathan
I want to remove bot users from my research because they inject a lot of noise into the data and do not represent human collaboration or the actual status of the community. The aim of the model would be to detect users that are bots (or mostly behave as bots) but are not flagged as such in the MediaWiki 'bot' group. It would only be used to exclude them from my analysis; it is not meant to label users within the MediaWiki communities.
This question came up while I was studying the wiki https://cocktails.wikia.com, where I found that one of the most prolific users is "IngredientSortBot". Besides its name, it has an edit history very characteristic of a bot (https://cocktails.wikia.com/wiki/Special:Contributions/IngredientSortBot). However, it is not included in any bot group, so it was included in my analysis and biased it.
The most recent of IngredientSortBot's 764 edits was in 2007, so if that wiki has a bot flagging system, the bot flag would likely have been removed at some point in the last decade. But if 764 edits make it significant on that wiki, I doubt that the wiki ever introduced bot flagging.
You can assume that editors with names ending in "Bot" are bots, and on English-language wikis you are pretty safe. If you assumed that all accounts ending in "bot" were bots you would lose a bit: three of the 5,000 most active accounts on the English Wikipedia are longstanding accounts whose names include "bot" but which were created before the rule reserving usernames ending in "bot" for bots.
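The name-based assumption can be sketched as follows. This is only an illustration under stated assumptions, not an established classifier: `looks_like_bot_name`, `split_by_name`, and every sample username except IngredientSortBot are hypothetical. The human-looking name "Talbot" is included to show the kind of account the heuristic would wrongly drop.

```python
def looks_like_bot_name(username):
    """True if the username ends in 'bot', ignoring case and outer spaces."""
    return username.strip().lower().endswith("bot")


def split_by_name(usernames, flagged_bots=frozenset()):
    """Separate likely bots (flagged or bot-named) from the remaining accounts."""
    likely_bots, others = [], []
    for name in usernames:
        if name in flagged_bots or looks_like_bot_name(name):
            likely_bots.append(name)
        else:
            others.append(name)
    return likely_bots, others


# Hypothetical account list; "QuietHelper" stands in for a flagged bot account.
accounts = ["IngredientSortBot", "Talbot", "ExampleEditor", "QuietHelper"]
bots, humans = split_by_name(accounts, flagged_bots={"QuietHelper"})

print(bots)    # ['IngredientSortBot', 'Talbot', 'QuietHelper']
print(humans)  # ['ExampleEditor']
```

Combining the name check with the wiki's own bot group keeps flagged accounts out regardless of their name, but unflagged automated accounts with ordinary names still slip through.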
If you want to filter out edits that "do not represent human collaboration or community actual status", then you might also want to filter out, or better, give a low weighting to, edits flagged as "minor". That feature is heavily used on Wikipedia.
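One way to picture the low-weighting option is a weighted edit count. This is a minimal sketch under assumptions: the record layout (a dict with a boolean `minor` key), the function name `weighted_edit_count`, and the weight of 0.25 are all hypothetical choices for illustration.

```python
def weighted_edit_count(edits, minor_weight=0.25):
    """Sum edits, counting each minor edit as only `minor_weight` of an edit."""
    return sum(minor_weight if edit["minor"] else 1.0 for edit in edits)


# Hypothetical edit records for a single user.
user_edits = [
    {"title": "Mojito", "minor": False},
    {"title": "Mojito", "minor": True},
    {"title": "Daiquiri", "minor": True},
]

print(weighted_edit_count(user_edits))                    # 1.5
print(weighted_edit_count(user_edits, minor_weight=0.0))  # 1.0
```

Setting `minor_weight` to 0.0 reproduces outright filtering, and 1.0 reproduces an unweighted count, so the same code covers both ends of the choice.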
Jonathan
On Fri, 25 Jan 2019 at 22:24, WereSpielChequers (<werespielchequers@gmail.com>) wrote:
> The most recent of IngredientSortBot's 764 edits was in 2007, so if that wiki has a bot flagging system, the bot flag would likely have been removed at some point in the last decade. But if 764 edits make it significant on that wiki, I doubt that the wiki ever introduced bot flagging.
What is that bot flagging system about? How does it work? In Wikia there are users in the "bot" group or the "bot-global" group (see, for instance, the bots for the Cocktails wiki: https://cocktails.wikia.com/api.php?action=query&list=groupmembers&gmgroups=bot|bot-global&gmlimit=500 ), and these have the capabilities that correspond to bots in the MediaWiki API, but I don't know of any other flagging system :S
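For anyone who wants to repeat that lookup on another wiki, here is a minimal sketch that rebuilds the query URL from its parts. The parameters `action`, `list=groupmembers`, `gmgroups`, and `gmlimit` are the ones in the link above; the helper name `groupmembers_url` is hypothetical, and `format=json` is added on the assumption that the standard MediaWiki output-format parameter applies here too. The sketch stops at building the URL and makes no assumption about the layout of the response.

```python
from urllib.parse import urlencode


def groupmembers_url(wiki_base, groups=("bot", "bot-global"), limit=500):
    """Build the group-members query URL for a Wikia wiki."""
    params = {
        "action": "query",
        "list": "groupmembers",
        "gmgroups": "|".join(groups),  # the pipe is percent-encoded as %7C
        "gmlimit": limit,
        "format": "json",  # assumed: standard MediaWiki API format parameter
    }
    return f"{wiki_base.rstrip('/')}/api.php?{urlencode(params)}"


print(groupmembers_url("https://cocktails.wikia.com"))
# https://cocktails.wikia.com/api.php?action=query&list=groupmembers&gmgroups=bot%7Cbot-global&gmlimit=500&format=json
```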
> You can assume that editors with names ending in "Bot" are bots, and on English-language wikis you are pretty safe. If you assumed that all accounts ending in "bot" were bots you would lose a bit: three of the 5,000 most active accounts on the English Wikipedia are longstanding accounts whose names include "bot" but which were created before the rule reserving usernames ending in "bot" for bots.
Not really; I just successfully created an account ending in "bot". I find this criterion for filtering out bots quite naive and inaccurate. It also misses non-flagged "bots" without the substring "bot" in their name.
> If you want to filter out edits that "do not represent human collaboration or community actual status", then you might also want to filter out, or better, give a low weighting to, edits flagged as "minor". That feature is heavily used on Wikipedia.
Hmm, I am not sure how widely that feature is used in Wikia. It might depend heavily on the experience or policy of each specific wiki and, for me, a minor edit could still be an indicator of human collaboration, so I would rather leave them in.
> Jonathan
Thank you all for your answers!