On Tuesday, November 10, 2015, Felix J. Scholz felixjacobscholz@gmail.com wrote:
Does this mean there is a significant break in continuity for the older (pre-05/2015) stats efforts such as pagecounts-raw [1] and pagecounts-all-sites [2]?
[1] https://dumps.wikimedia.org/other/pagecounts-raw/
[2] https://dumps.wikimedia.org/other/pagecounts-all-sites/
No, those files are intact and will continue to be generated until we have a conversation about how we're going to clean up this fairly confusing situation. Look for that discussion starting soon; we will link to it from the dumps.wikimedia.org/other page and post it on this list as well.
On Tue, Nov 10, 2015 at 7:04 PM, Nuria Ruiz <nuria@wikimedia.org> wrote:
Any ideas about why there is such a big difference between the old and new numbers?
I guess you missed the banner, which explains the main differences between the definitions. I have copied it below.
Please take a look at the new definition here: https://meta.wikimedia.org/wiki/Research:Page_view
The new definition differs significantly from the legacy one.
<banner>
In the old situation, spider/crawler traffic (from search engines) wasn't filtered (a known issue; we needed Hadoop to make this possible). Those requests made up almost 20% of total page 'views' on Wikipedia, and even more on sister projects (details will follow). Obviously the top table row, with year-over-year changes, doesn't make sense until the new pageview definition has been in effect for at least a year. That feature is now temporarily disabled.
Also, the new data stream excludes housekeeping traffic (mainly used for fundraising banners). As usage grew, these so-called HideBanner requests became far more numerous than actual traffic on smaller Wikimedia projects (not on Wikipedia). We will update the reports soon to make the impact of this 'pollution' visible; it affected much of 2014 and the first four months of 2015.
</banner>
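To make the two exclusions in the banner concrete, here is a minimal sketch of that kind of filter. It is not the actual pageview definition (see the Research:Page_view page for that); the user-agent pattern, the "HideBanner" URL marker, the sample requests, and the function name `is_counted_pageview` are all assumptions chosen for illustration.

```python
import re

# Hypothetical crawler pattern; the real definition uses its own, more careful rules.
SPIDER_UA = re.compile(r"bot|crawler|spider", re.IGNORECASE)


def is_counted_pageview(user_agent: str, url: str) -> bool:
    """Return True if a request would count under this simplified filter."""
    if SPIDER_UA.search(user_agent):
        return False  # spider/crawler traffic, unfiltered in the old numbers
    if "HideBanner" in url:
        return False  # housekeeping request (assumed URL marker), not a page view
    return True


# Made-up sample requests: (user agent, requested URL).
sample_requests = [
    ("Mozilla/5.0 (X11; Linux x86_64)", "/wiki/Main_Page"),
    ("Googlebot/2.1 (+http://www.google.com/bot.html)", "/wiki/Main_Page"),
    ("Mozilla/5.0 (X11; Linux x86_64)", "/wiki/Special:HideBanners"),
]

counted = [req for req in sample_requests if is_counted_pageview(*req)]
print(f"{len(counted)} of {len(sample_requests)} requests counted as page views")
```

Under the old counting all three sample requests would have been tallied, which is the kind of gap the banner describes.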
On Tue, Nov 10, 2015 at 3:35 PM, Pine W <wiki.pine@gmail.com> wrote:
IIRC from the previous numbers showing about 8.5B pageviews for ENWP recently (I think that stat was for Sept), it looks like the new number shows a major drop to 7.5B. Any ideas about why there is such a big difference between the old and new numbers?
Thanks,
Pine
On Nov 10, 2015 2:02 PM, "Dan Andreescu" <dandreescu@wikimedia.org> wrote:
\o/
On Tuesday, November 10, 2015, Nuria Ruiz <nuria@wikimedia.org> wrote:
Hello!
The analytics team wishes to announce that we have finally transitioned several of the pageview reports in stats.wikimedia.org to the new pageview definition [1]. This means that we should no longer have two conflicting sources of pageview numbers.
While we are not fully done transitioning pageview reports, we feel this is an important enough milestone to warrant some communication. BIG thanks to Erik Z. for his work on this project.
Please take a look at a report using the new definition (a banner is present when a report has been updated): http://stats.wikimedia.org/EN/TablesPageViewsMonthlyCombined.htm
Thanks,
Nuria
Analytics mailing list
Analytics@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/analytics