Thank you very much, Neil! This is very helpful. :)
On Mon, 21 Dec 2020, 4:54 pm Neil Shah-Quinn, <nshahquinn(a)wikimedia.org>
wrote:
Unfortunately, I can't tell you anything more than what you already know!
I think that huge, temporary spikes in edits or pageviews that don't match
expected patterns of human use (like the death of a celebrity or a big
editing campaign) are most likely caused by bots. With editing spikes, I
can usually confirm this belief by examining the edits. With pageview
spikes, it's much harder. If the spike was in the last 90 days, I could
investigate more by looking at the confidential raw traffic data
<https://wikitech.wikimedia.org/wiki/Analytics/Data_Lake/Traffic/Webrequest>,
but after 90 days, that data is deleted to protect user privacy.
The case you mentioned fits all my criteria. First, it is a huge,
temporary spike. Second, it doesn't match expected patterns of human use:
there is no matching spike in mobile pageviews and the pages involved are
not pages humans would want to read. So, it's for these reasons only that I
am confident that it was caused by bots.
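The two criteria above can be sketched in a few lines of Python. This is only an illustration of the reasoning, not a method used at Wikimedia: the function name, the `spike_factor` threshold, and the sample numbers are all hypothetical, and the sketch covers only the "desktop spike without a matching mobile spike" check, not the question of which pages were viewed.

```python
from statistics import median


def flag_suspect_months(desktop, mobile, spike_factor=3.0):
    """Return months where desktop views spike but mobile views do not.

    desktop, mobile: dicts mapping a month label to a pageview count.
    spike_factor: hypothetical threshold; a month counts as a spike when
    its count exceeds spike_factor times the median of its own series.
    """
    desktop_base = median(desktop.values())
    mobile_base = median(mobile.values())
    suspects = []
    for month in sorted(desktop):
        desktop_spike = desktop[month] > spike_factor * desktop_base
        mobile_spike = mobile.get(month, 0) > spike_factor * mobile_base
        # A spike driven by human interest would usually show up on mobile too.
        if desktop_spike and not mobile_spike:
            suspects.append(month)
    return suspects


# Made-up numbers for illustration only, not real Wikistats data.
desktop = {"2017-11": 100, "2017-12": 110, "2018-01": 500, "2018-02": 105}
mobile = {"2017-11": 200, "2017-12": 210, "2018-01": 220, "2018-02": 215}
print(flag_suspect_months(desktop, mobile))
```

A flagged month is only a candidate for closer inspection; confirming bot traffic would still need the kind of evidence described above.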
Now, *why* would someone use a bot to access millions of Bangla Wikipedia
articles for a single month? I have no idea. It could just be a programmer
somewhere doing an experiment. Your guess is as good as mine 😊
On Fri, 18 Dec 2020 at 21:52, Ankan Ghosh Dastider <
ankanghoshdastider(a)gmail.com> wrote:
Hi Neil,
Thank you very much for responding so fast.
That could well be the answer! Can you please share any definite (or
related) information about the error at that time, if possible? Can
you give me any idea why bot views increase so much in a certain
year (and on certain dates)? If possible, an example would be really
helpful.
Ankan
On Fri, Dec 18, 2020 at 10:01 PM Neil Shah-Quinn <
nshahquinn(a)wikimedia.org> wrote:
That's a good question! I think the most likely explanation is that a
bot automatically viewed those pages. I see that you have already removed
"spider" and "automated" traffic in your Wikistats graphs, but those
classifications are not perfect. Before March 2020, they only detected bots
that explicitly marked themselves as bots. Now, our methods are more
sophisticated
<https://wikitech.wikimedia.org/wiki/Analytics/Data_Lake/Traffic/BotDetection>,
but I am sure they still miss some things.
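To compare the agent classifications side by side, one option is the public Pageviews REST API. The sketch below only builds request URLs and does not fetch anything. It assumes the aggregate endpoint follows the path layout project/access/agent/granularity/start/end and accepts the agent values "user", "spider", and "automated"; please check the API documentation before relying on it. The function name is hypothetical.

```python
# Assumed base path of the Wikimedia Pageviews REST API aggregate endpoint.
BASE = "https://wikimedia.org/api/rest_v1/metrics/pageviews/aggregate"


def aggregate_pageviews_url(project, access, agent, granularity, start, end):
    """Build an aggregate-pageviews URL (hypothetical helper).

    start and end are timestamps in YYYYMMDDHH form.
    """
    return f"{BASE}/{project}/{access}/{agent}/{granularity}/{start}/{end}"


# One URL per agent type for desktop traffic on Bengali Wikipedia, January 2018.
for agent in ("user", "spider", "automated"):
    print(aggregate_pageviews_url(
        "bn.wikipedia.org", "desktop", agent, "monthly",
        "2018010100", "2018020100",
    ))
```

Since the "automated" classification only started in 2020, a request for it covering 2018 may well return nothing.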
On Fri, 18 Dec 2020 at 18:48, Ankan Ghosh Dastider <
ankanghoshdastider(a)gmail.com> wrote:
Hello everyone,
I am Ankan, a Wikimedian from Bangladesh. Recently, I was searching
the Wikimedia stats website for research purposes. I got a bit curious
about the Bengali Wikipedia total page views section
<https://stats.wikimedia.org/#/bn.wikipedia.org/reading/total-page-views/normal%7Cbar%7Call%7Caccess~desktop*mobile-app*mobile-web+(agent)~user%7Cmonthly>,
as the traffic in January 2018 didn't match the normal flow and showed a
sudden surge of desktop access by users. It is unprecedented and the highest
to date. If you check the normal rate of desktop access, you will see
that it is almost 450% of the second highest.
The pageview results suggest that the top-visited pages are
category-related and date-related pages (the most visited one is
'Category:Stubs', see here
<https://pageviews.toolforge.org/?project=bn.wikipedia.org&platform=desktop&agent=user&redirects=0&start=2018-01-01&end=2018-01-31&pages=%E0%A6%AC%E0%A6%BF%E0%A6%B7%E0%A6%AF%E0%A6%BC%E0%A6%B6%E0%A7%8D%E0%A6%B0%E0%A7%87%E0%A6%A3%E0%A7%80:%E0%A6%85%E0%A6%B8%E0%A6%AE%E0%A7%8D%E0%A6%AA%E0%A7%82%E0%A6%B0%E0%A7%8D%E0%A6%A3>)
which is quite puzzling, as these pages are rarely viewed by general
readers. The results for certain dates in January 2018 are completely
exceptional.
Note that I have checked some other languages, and the rate is normal
there.
I am seeking your assistance to analyze the probable reason behind this
surge. Thanks in advance!
Best regards,
Ankan
--
Ankan Ghosh Dastider (he/him)
User:ANKAN <https://meta.wikimedia.org/wiki/User:ANKAN> || All Wikimedia
Foundation <https://meta.wikimedia.org/wiki/Wikimedia_Foundation>'s
public Wiki
Executive Member || Wikimedia Bangladesh <http://wikimedia.org.bd/>
Twitter <https://twitter.com/Iagdastider> | LinkedIn
<https://www.linkedin.com/in/ankan-ghosh-dastider/> | ResearchGate
<https://www.researchgate.net/profile/Ankan_Ghosh_Dastider>
_______________________________________________
Analytics mailing list
Analytics(a)lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/analytics
--
Neil Shah-Quinn
senior data scientist, Product Analytics
<https://www.mediawiki.org/wiki/Product_Analytics>
Wikimedia Foundation <https://wikimediafoundation.org/>