Thanks for opening the ticket and for clarifying the issue more.

On a related note, could you also document the unusually high number of pageviews for 404.php returned by the API? That number also shot up in October 2016; see http://wikipediaviews.org/displayviewsformultiplemonths.php?page=404.php&allmonths=allmonths&drilldown=all for the historical trend.

Vipul

On Thu, Nov 17, 2016 at 1:26 PM, Nuria Ruiz <nuria@wikimedia.org> wrote:
>Just to verify what you are saying, would it be right to say that the bug fix caused
>a lot of pageviews to be moved from the respective (nonexistent) pages to "-" pageviews?

No, the bug fix stops those faulty requests from being stored as pageviews, so it cannot make that number increase. I am not sure we can link the surge of "-" pageviews in October to any specific cause without further research. I have filed a ticket to that effect; hopefully we can get to it before we do away with the raw data: https://phabricator.wikimedia.org/T150990

>And, does that mean that the current estimate of "-" pageviews is more accurate than it used to be prior to the bug fix?
No, it doesn't. 




On Thu, Nov 17, 2016 at 1:17 PM, Vipul Naik <vipulnaik1@gmail.com> wrote:
Thank you for linking to that bug, Marcel. Just to verify what you are saying, would it be right to say that the bug fix caused a lot of pageviews to be moved from the respective (nonexistent) pages to "-" pageviews? And, does that mean that the current estimate of "-" pageviews is more accurate than it used to be prior to the bug fix?

Vipul

On Wed, Nov 16, 2016 at 4:33 AM, Marcel Ruiz Forns <mforns@wikimedia.org> wrote:
Maybe the high value in October (45M) has something to do with the last changes in https://phabricator.wikimedia.org/T145922 ?

On Mon, Nov 14, 2016 at 9:25 PM, Nuria Ruiz <nuria@wikimedia.org> wrote:

On Tue, Nov 8, 2016 at 7:25 AM, Vipul Naik <vipulnaik1@gmail.com> wrote:
Hi Joseph,

Thanks for the clarification.

Any ideas why this number is much higher for some months? In particular, on desktop, it's high in the months of July to September 2015 (around 10 million, compared to the usual 5 million) and then high again in October 2016 (45 million, about 10x the usual value).

Data is from http://wikipediaviews.org/displayviewsformultiplemonths.php?page=-&allmonths=allmonths&drilldown=all which summarizes results from the Wikimedia API (and stats.grok.se for data before July 2015).

Vipul

On Tue, Nov 8, 2016 at 3:46 AM, Joseph Allemandou <jallemandou@wikimedia.org> wrote:
Hello Issa,

Thank you for your question.
The very high number of views of the "-" page arises because the dash is used as a special value meaning "no page title found" when titles are extracted from URLs.
We definitely should document this in the API; I am creating this task: https://phabricator.wikimedia.org/T150249
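
To illustrate why such a special value inflates the count for the real "-" page, here is a minimal sketch. It is hypothetical and is not the actual Wikimedia extraction code: the function name extract_title, the constant NO_TITLE, and the two URL shapes handled are assumptions made only for this example.

```python
from urllib.parse import urlsplit, parse_qs, unquote

# Assumption for illustration: the sentinel is the literal dash described above.
NO_TITLE = "-"

def extract_title(url):
    """Return the page title found in a URL, or NO_TITLE if none is found.

    Hypothetical helper: handles only /wiki/<title> paths and ?title= queries.
    """
    parts = urlsplit(url)
    if parts.path.startswith("/wiki/"):
        title = unquote(parts.path[len("/wiki/"):])
        if title:
            return title
    # Fall back to a ?title=... query parameter, if present.
    query_title = parse_qs(parts.query).get("title", [""])[0]
    return query_title or NO_TITLE

# A genuine request for the page "-" and a request with no title at all
# both come out as "-", so their counts cannot be told apart afterwards.
real_dash = extract_title("https://en.wikipedia.org/wiki/-")
no_title = extract_title("https://en.wikipedia.org/w/index.php")
print(real_dash == no_title)
```

In this sketch, the two requests collapse into the same key, which is the kind of conflation that would make the "-" count far larger than the traffic to the actual page.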
Best
Joseph


On Tue, Nov 8, 2016 at 12:28 AM, Issa Rice <riceissa@gmail.com> wrote:
Dear Analytics Mailing List,

Recently while querying pageviews of various pages, I discovered that
the page whose title is a single hyphen character (i.e. with the title
"-", with URL <https://en.wikipedia.org/wiki/->, which redirects to
<https://en.wikipedia.org/wiki/Hyphen-minus>) receives an unusually high
number of pageviews under the Pageview API. Taking October 2015 as an
example, the page received 5.4 million pageviews during that month
according to the API:
<https://wikimedia.org/api/rest_v1/metrics/pageviews/per-article/en.wikipedia/desktop/user/-/daily/20151001/20151031>.

However, according to stats.grok.se (which was still operational in the
same month), the page received only 1209 pageviews:
<http://stats.grok.se/en/201510/->.

Looking at the tabulation of pageviews on Wikipedia Views, the increase
in pageviews for this page coincides with the change to the Pageview
API in July 2015:
<http://wikipediaviews.org/displayviewsformultiplemonths.php?page=-&allmonths=allmonths&drilldown=all>.

As I understand it, page titles must be URL-encoded before the query,
but the URL-encoding of "-" is itself.
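
A minimal sketch of that check, assuming Python's standard urllib.parse module. The helper name per_article_url and its default arguments are illustrative only and are not part of the API; the base path is copied from the per-article URL quoted above.

```python
from urllib.parse import quote

# Base path taken from the per-article URL quoted earlier in this message.
BASE = "https://wikimedia.org/api/rest_v1/metrics/pageviews/per-article"

def per_article_url(title, start, end, project="en.wikipedia",
                    access="desktop", agent="user", granularity="daily"):
    """Build a per-article query URL (hypothetical helper for illustration)."""
    # Spaces become underscores in page titles; safe="" also percent-encodes
    # "/", which would otherwise be read as a path separator.
    encoded = quote(title.replace(" ", "_"), safe="")
    return f"{BASE}/{project}/{access}/{agent}/{encoded}/{granularity}/{start}/{end}"

print(quote("-", safe=""))        # the hyphen is an unreserved character
print(quote("AC/DC", safe=""))    # a title that does change when encoded
print(per_article_url("-", "20151001", "20151031"))
```

The hyphen passes through quote unchanged, so the URL built for the title "-" is identical to the one given above for October 2015.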

I looked at the API documentation but did not see this behavior listed,
so I am wondering where these numbers are coming from.

Best regards,
Issa


_______________________________________________
Analytics mailing list
Analytics@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/analytics




--
Joseph Allemandou
Data Engineer @ Wikimedia Foundation
IRC: joal

--
Marcel Ruiz Forns
Analytics Developer
Wikimedia Foundation
