Hi Matthew,
I don't know many of the back-end details, but I've used the traffic files a fair bit. They count raw requests, including requests for pages that don't exist or that are just redirects to pages with normalized titles. "Main_page" is a redirect to "Main_Page", so when someone requests "Main_page" it gets counted, and then so does "Main_Page" once the redirect occurs. Looking through one of the traffic files, I see:
{en,MAIN_PAGE,2,36908}
{en,Main+Page,2,36907}
{en,Main-Page,5,92053}
{en,Main-page,2,36896}
{en,MainPage,4,72800}
{en,Main_Page,316431,8344162628}
{en,Main_Page/android-app:/org.wikipedia/http/en.m.wikipedia.org/wiki/Main_Page,21,544446}
{en,Main_page,181,4150875}
{en,Mainpage,13,183594}
{en,main_page,2,36934}
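In case it helps, here is a rough Python sketch of how you could total the counts across the different spellings instead of grepping for a single one. It assumes each line has the "project title count bytes" layout shown in your grep output. The loose_key() grouping rule is something I made up for illustration and is not MediaWiki's actual title normalization. The sample rows are a few entries from the listing above, rewritten in the space-separated form.

```python
from collections import defaultdict


def parse_line(line):
    """Split 'project title count bytes' into typed fields, or return None."""
    parts = line.split()
    if len(parts) != 4:
        return None
    project, title, count, size = parts
    try:
        return project, title, int(count), int(size)
    except ValueError:
        return None


def loose_key(title):
    # Hypothetical grouping rule (an assumption, not MediaWiki's behaviour):
    # ignore case and drop '+', '-' and '_' so spelling variants collide.
    for sep in ("+", "-", "_"):
        title = title.replace(sep, "")
    return title.lower()


def count_variants(lines, project, target):
    """Return {title: request count} for titles that loosely match target."""
    wanted = loose_key(target)
    totals = defaultdict(int)
    for line in lines:
        record = parse_line(line)
        if record is None:
            continue  # skip blank or malformed lines
        proj, title, count, _size = record
        if proj == project and loose_key(title) == wanted:
            totals[title] += count
    return dict(totals)


sample = [
    "en MainPage 4 72800",
    "en Main_Page 316431 8344162628",
    "en Main_page 181 4150875",
    "en main_page 2 36934",
]

variants = count_variants(sample, "en", "Main_Page")
print(variants)
print(sum(variants.values()))
```

Bear in mind that if a redirected request really is counted under both titles, as described above, the summed figure will overstate actual visits somewhat. For a real file you would pass the open file object in place of the sample list.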
Date: Sat, 28 Feb 2015 15:14:47 -0500
From: ruttleym@googlemail.com
To: analytics@lists.wikimedia.org
Subject: [Analytics] Odd data in dumps
Hi Analytics,
I've been digging through some of the wiki page count files and found some strange results. In several files, the Main_page visit count is vastly lower than expected:
mruttley$ cat pagecounts-20141101-170000 | grep "^en Main_page"
en Main_page 260 6202982
mruttley$ cat pagecounts-20150201-170000 | grep "^en Main_page"
en Main_page 200 4802139
Only 260 and 200 page views! What do you reckon? Am I doing it wrong?
Best regards,
Matthew
_______________________________________________
Analytics mailing list
Analytics@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/analytics