--- On Fri, 14/11/08, Erik Zachte <erikzachte(a)infodisiac.com> wrote:
From: Erik Zachte <erikzachte(a)infodisiac.com>
Subject: RE: [Wiki-research-l] "Regular contributor"
To: "'Research into Wikimedia content and communities'"
<wiki-research-l(a)lists.wikimedia.org>, glimmer_phoenix(a)yahoo.es
Date: Friday, 14 November 2008, 2:40
Many bots that are active on many wikis are not registered as such on
smaller wikis.
Therefore I treat any user name that is registered as a bot on 10+ wikis
as a bot on all wikis.
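A minimal sketch of the "10+ wikis" rule described above, in Python. The input format (pairs of wiki and user name, one per wiki where the account carries a bot flag) and the function names are assumptions made for illustration; this is not the actual wikistats code.

```python
from collections import defaultdict

BOT_WIKI_THRESHOLD = 10  # the "10+ wikis" cut-off from the message


def global_bot_names(bot_flags, threshold=BOT_WIKI_THRESHOLD):
    """Return user names flagged as bot on at least `threshold` wikis.

    bot_flags: iterable of (wiki, user_name) pairs (hypothetical format).
    """
    wikis_per_user = defaultdict(set)
    for wiki, user_name in bot_flags:
        wikis_per_user[user_name].add(wiki)
    return {
        name for name, wikis in wikis_per_user.items()
        if len(wikis) >= threshold
    }


def is_bot(user_name, wiki, local_bots, global_bots):
    """A user counts as a bot if flagged on this wiki or on enough wikis.

    local_bots: dict mapping wiki -> set of locally flagged bot names.
    """
    return user_name in local_bots.get(wiki, set()) or user_name in global_bots
```

With this rule, a name flagged on ten wikis is treated as a bot even on a small wiki where nobody registered it, while a name flagged on a single wiki stays a bot only there.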
Seems very reasonable :).
It is of course again a correction which is not 100% accurate, but
close, I might hope.
Paraphrasing one of my research colleagues: something is better than
nothing at all :).
Single User Logon can help in this respect some day.
Wow, man. That would let my model jump to the speed of light.
If only I were capable of tracing users across different languages...
In theory we could spot some bots by their behavior, say a user that
edits 24 hours per day, or manages 5 updates per second for a long time,
or added thousands of articles in a short period.
But I’m not sure it would be worth the effort, and it would be low
priority in any case.
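The three behavioral signals mentioned above could be sketched roughly as follows. Only the rate of 5 edits per second comes from the message; the window lengths, the creation limit, and all function names are assumptions chosen for illustration, and any real filter would need tuning per wiki.

```python
from collections import defaultdict
from datetime import timedelta

EDITS_PER_SECOND = 5                      # rate mentioned in the message
SUSTAINED_WINDOW = timedelta(minutes=1)   # assumed meaning of "a long time"
CREATION_LIMIT = 1000                     # assumed meaning of "thousands"
CREATION_WINDOW = timedelta(days=1)       # assumed meaning of "a short period"


def max_events_in_window(timestamps, window):
    """Largest number of events that fit inside any span of length `window`."""
    ts = sorted(timestamps)
    best = start = 0
    for end in range(len(ts)):
        while ts[end] - ts[start] > window:
            start += 1
        best = max(best, end - start + 1)
    return best


def looks_like_bot(edit_times, creation_times=()):
    """Heuristic check on one user's edit and page-creation datetimes."""
    # Signal 1: edits in all 24 hours of a single calendar day.
    hours_by_day = defaultdict(set)
    for t in edit_times:
        hours_by_day[t.date()].add(t.hour)
    if any(len(hours) == 24 for hours in hours_by_day.values()):
        return True
    # Signal 2: the edit rate is sustained over the whole window.
    needed = EDITS_PER_SECOND * SUSTAINED_WINDOW.total_seconds()
    if max_events_in_window(edit_times, SUSTAINED_WINDOW) >= needed:
        return True
    # Signal 3: very many new articles in a short period.
    return max_events_in_window(creation_times, CREATION_WINDOW) >= CREATION_LIMIT
```

Each call needs the full, sorted edit history of one user, which hints at why this kind of filtering gets expensive on the larger wikis.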
I also have my doubts about the filtering conditions. For instance, in
eswiki, 'BOTpolicia' is not registered as a bot and it's responsible for
more than 90,000 edits so far. On the other hand, a famous user in
eswiki (retired for the moment, id=13770 to be precise) is responsible
for 100,000 edits, and was erroneously identified as a bot many
times :). We have similar cases in other languages.
Filtering by number of edits/hour or similar may require a lot of
time/resources, especially in larger Wikipedias (sorry, but for my
thesis I'm mainly focused on the top-ten Wikipedias :) ).
Honestly, I don't have a good answer for this right now.
Best.
F.
Erik
From: wiki-research-l-bounces(a)lists.wikimedia.org
[mailto:wiki-research-l-bounces@lists.wikimedia.org] On Behalf Of Ziko van Dijk
Sent: Thursday, November 13, 2008 23:37
To: glimmer_phoenix(a)yahoo.es; Research into Wikimedia content and communities
Subject: Re: [Wiki-research-l] "Regular contributor"
Hello Felipe,
Maybe we speak about different things now. At
http://stats.wikimedia.org/EN/BotActivityMatrix.htm
wiki  bot share  stats page
de     8%        <http://stats.wikimedia.org/EN/TablesWikipediaDE.htm>
ja     6%        <http://stats.wikimedia.org/EN/TablesWikipediaJA.htm>
fr    22%        <http://stats.wikimedia.org/EN/TablesWikipediaFR.htm>
it    25%        <http://stats.wikimedia.org/EN/TablesWikipediaIT.htm>
pl    26%        <http://stats.wikimedia.org/EN/TablesWikipediaPL.htm>
es    15%        <http://stats.wikimedia.org/EN/TablesWikipediaES.htm>
nl    29%        <http://stats.wikimedia.org/EN/TablesWikipediaNL.htm>
pt    30%        <http://stats.wikimedia.org/EN/TablesWikipediaPT.htm>
ru    26%        <http://stats.wikimedia.org/EN/TablesWikipediaRU.htm>
zh    15%        <http://stats.wikimedia.org/EN/TablesWikipediaZH.htm>
sv    23%        <http://stats.wikimedia.org/EN/TablesWikipediaSV.htm>
fi    22%        <http://stats.wikimedia.org/EN/TablesWikipediaFI.htm>
The bot share of all edits is not that insignificant.
Ziko
2008/11/13 Felipe Ortega <glimmer_phoenix(a)yahoo.es>
Hi, Erik, and all.
IMHO, it would be a good idea... but definitely not an urgent one. In
our analyses of the top-ten Wikipedias, we found that bot contributions
introduced very little noise in the data (to be statistically precise,
it was not significant at all).
You also have the additional problem that some bots are not identified
in the users_group table.
My "practical impression" is that when you deal with overall figures,
then bots are irrelevant. However, if you want to focus on special
metrics like concentration indexes, then their contribution DOES MATTER,
since a very active bot in one month may ruin your measurements.
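To illustrate why one very active bot matters for a concentration index, here is a small sketch using the Gini coefficient as the index. The choice of Gini and the function name are assumptions for illustration; the thread does not say which concentration indexes were used.

```python
def gini(counts):
    """Gini coefficient of non-negative edit counts.

    0 means every contributor made the same number of edits; values
    close to 1 mean a few contributors made almost all of them.
    """
    values = sorted(counts)
    n = len(values)
    total = sum(values)
    if n == 0 or total == 0:
        return 0.0
    # Rank-weighted form for sorted data:
    # G = 2 * sum(rank_i * x_i) / (n * sum(x)) - (n + 1) / n
    weighted = sum(rank * x for rank, x in enumerate(values, start=1))
    return 2 * weighted / (n * total) - (n + 1) / n
```

For a month where five human editors made 8, 9, 10, 12 and 15 edits, adding a single unflagged bot with 5000 edits pushes the index from a fairly even distribution to an extremely concentrated one, even though the humans behaved exactly the same.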
Regards,
Felipe.
--- On Wed, 22/10/08, Erik Zachte <erikzachte(a)infodisiac.com> wrote:
From: Erik Zachte <erikzachte(a)infodisiac.com>
Subject: [Wiki-research-l] "Regular contributor"
To: wiki-research-l(a)lists.wikimedia.org
Date: Wednesday, 22 October 2008, 9:55
Statistics, with "Wikipedians", "active" and "very active users";
> like often, Zachte's Statistics are great, but easily misleading.
Also keep in mind that most figures in wikistats still include bot
edits.
IMO it becomes more and more urgent to present separate counts for
humans and bots.
For instance in eo: 54% of total edits for all time were bot edits, but
most of these will be from recent years, so the percentage will
_______________________________________________
Wiki-research-l mailing list
Wiki-research-l(a)lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wiki-research-l
--
Ziko van Dijk
NL-Silvolde