Thank you very much, Neil! This is very helpful. :)
On Mon, 21 Dec 2020, 4:54 pm Neil Shah-Quinn, nshahquinn@wikimedia.org wrote:
Unfortunately, I can't tell you anything more than what you already know! I think that huge, temporary spikes in edits or pageviews that don't match expected patterns of human use (like the death of a celebrity or a big editing campaign) are most likely caused by bots. With editing spikes, I can usually confirm this belief by examining the edits. With pageview spikes, it's much harder. If the spike was in the last 90 days, I could investigate more by looking at the confidential raw traffic data https://wikitech.wikimedia.org/wiki/Analytics/Data_Lake/Traffic/Webrequest, but after 90 days, that data is deleted to protect user privacy.
The case you mentioned fits all my criteria. First, it is a huge, temporary spike. Second, it doesn't match expected patterns of human use: there is no matching spike in mobile pageviews and the pages involved are not pages humans would want to read. So, it's for these reasons only that I am confident that it was caused by bots.
Now, *why* would someone use a bot to access millions of Bangla Wikipedia articles for a single month? I have no idea. It could just be a programmer somewhere doing an experiment. Your guess is as good as mine 😊
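The two criteria described above (a large, temporary desktop spike with no matching spike in mobile pageviews) can be sketched as a small check over monthly counts. This is only an illustration, not the method Wikimedia uses for bot detection: the function name, the median baseline, the 3x threshold, and the sample numbers are all assumptions made up for the example.

```python
from statistics import median


def flag_suspicious_months(desktop, mobile, ratio_threshold=3.0):
    """Return months where desktop views spike but mobile views do not.

    desktop, mobile: dicts mapping a month label to a pageview count.
    A month counts as a spike when its count is at least
    ratio_threshold times the median of its own series
    (an assumed, simplistic definition of "spike").
    """
    desktop_baseline = median(desktop.values())
    mobile_baseline = median(mobile.values())
    flagged = []
    for month, views in desktop.items():
        desktop_spike = views >= ratio_threshold * desktop_baseline
        mobile_spike = mobile.get(month, 0) >= ratio_threshold * mobile_baseline
        # A desktop-only spike does not look like ordinary human readership.
        if desktop_spike and not mobile_spike:
            flagged.append(month)
    return flagged


# Made-up counts for illustration only, not real Wikistats figures.
desktop = {"2017-11": 100, "2017-12": 110, "2018-01": 500, "2018-02": 105}
mobile = {"2017-11": 200, "2017-12": 210, "2018-01": 220, "2018-02": 215}
print(flag_suspicious_months(desktop, mobile))
```

A flagged month is only a candidate for closer inspection. As noted above, confirming a pageview spike would still require looking at the pages involved or at the raw traffic data while it is available.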
On Fri, 18 Dec 2020 at 21:52, Ankan Ghosh Dastider <ankanghoshdastider@gmail.com> wrote:
Hi Neil,
Thank you very much for responding so fast.
That could well be the answer! Could you please share any definite (or related) information about the error at that time, if possible? Can you give me any idea of why bot views increase so much in a certain year (and on certain dates)? If possible, an example would be really helpful.
Ankan
On Fri, Dec 18, 2020 at 10:01 PM Neil Shah-Quinn <nshahquinn@wikimedia.org> wrote:
That's a good question! I think the most likely explanation is that a bot automatically viewed those pages. I see that you have already removed "spider" and "automated" traffic in your Wikistats graphs, but those classifications are not perfect. Before March 2020, they only detected bots that explicitly marked themselves as bots. Now, our methods are more sophisticated https://wikitech.wikimedia.org/wiki/Analytics/Data_Lake/Traffic/BotDetection, but I am sure they still miss some things.
On Fri, 18 Dec 2020 at 18:48, Ankan Ghosh Dastider <ankanghoshdastider@gmail.com> wrote:
Hello everyone,
I am Ankan, a Wikimedian from Bangladesh. Recently, I was searching the Wikimedia stats website for research purposes. I got a bit curious about the Bengali Wikipedia total page view section https://stats.wikimedia.org/#/bn.wikipedia.org/reading/total-page-views/normal%7Cbar%7Call%7Caccess~desktop*mobile-app*mobile-web+(agent)~user%7Cmonthly, as the traffic in January 2018 didn't match the normal flow: there was a sudden surge of desktop access by users. It is unprecedented and remains the highest to date. If you compare it with the normal rate of desktop access, you will see that it is almost 450% of the second-highest value.
The pageview results show that the top-visited pages are category-related and date-related pages (the most visited one is 'Category:Stubs', see here https://pageviews.toolforge.org/?project=bn.wikipedia.org&platform=desktop&agent=user&redirects=0&start=2018-01-01&end=2018-01-31&pages=%E0%A6%AC%E0%A6%BF%E0%A6%B7%E0%A6%AF%E0%A6%BC%E0%A6%B6%E0%A7%8D%E0%A6%B0%E0%A7%87%E0%A6%A3%E0%A7%80:%E0%A6%85%E0%A6%B8%E0%A6%AE%E0%A7%8D%E0%A6%AA%E0%A7%82%E0%A6%B0%E0%A7%8D%E0%A6%A3), which is quite puzzling, as general readers hardly ever view these pages. The results for certain dates in January 2018 are completely exceptional.
Note that I have checked some other languages, and the rate is normal there.
I am seeking your assistance to analyze the probable reason behind this surge. Thanks in advance!
Best regards, Ankan
-- Ankan Ghosh Dastider (he/him) User:ANKAN https://meta.wikimedia.org/wiki/User:ANKAN || All Wikimedia Foundation https://meta.wikimedia.org/wiki/Wikimedia_Foundation's public Wiki Executive Member || Wikimedia Bangladesh http://wikimedia.org.bd/ Twitter https://twitter.com/Iagdastider | LinkedIn https://www.linkedin.com/in/ankan-ghosh-dastider/ | ResearchGate https://www.researchgate.net/profile/Ankan_Ghosh_Dastider _______________________________________________ Analytics mailing list Analytics@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/analytics
-- Neil Shah-Quinn senior data scientist, Product Analytics https://www.mediawiki.org/wiki/Product_Analytics Wikimedia Foundation https://wikimediafoundation.org/