I was trying to get pageviews data for the Travel + Leisure Wikipedia page: https://en.wikipedia.org/wiki/Travel_%2B_Leisure
It seems like the data is missing for the month of May on desktop. In particular, this link returns a Not found error:
https://wikimedia.org/api/rest_v1/metrics/pageviews/per-article/en.wikipedia...
The corresponding links for April and June return data, but the last few days of April and the first few days of June are missing:
https://wikimedia.org/api/rest_v1/metrics/pageviews/per-article/en.wikipedia... (data is missing for June 1 to 5 but present June 6 onward)
https://wikimedia.org/api/rest_v1/metrics/pageviews/per-article/en.wikipedia... (data is missing for April 25 onward)
The same is true on mobile-web.
I thought the article might have been deleted and then reinstated, but the revision history doesn't show any changes during that period, there is no update on the talk page, and there is nothing in the deletion log.
Any ideas?
I've also noticed the pageviews API occasionally omitting data for a few days for other queries, though a re-query usually works to fill in the missing data. For instance, https://wikimedia.org/api/rest_v1/metrics/pageviews/per-article/en.wikipedia... originally returned no results for me but on a re-query I was able to get results. I'll share more information on this in a separate email if I'm able to reproduce.
Thank you,
Vipul
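For anyone who wants to check a response for the kind of gap described above, here is a minimal sketch. It assumes daily results carry timestamps in the form YYYYMMDDHH; the helper name `missing_days` and the sample data (including the year) are hypothetical and only illustrate the "June 6 onward" case.

```python
from datetime import date, timedelta


def missing_days(returned_timestamps, start, end):
    """Return the dates in [start, end] that have no entry in the response."""
    # Assumption: each timestamp looks like "2019060600" (YYYYMMDDHH),
    # so the first eight characters identify the day.
    seen = {ts[:8] for ts in returned_timestamps}
    gaps = []
    day = start
    while day <= end:
        if day.strftime("%Y%m%d") not in seen:
            gaps.append(day)
        day += timedelta(days=1)
    return gaps


# Hypothetical sample: a June response that only has data from June 6 onward.
# The year is a placeholder; the thread does not state it.
sample = [f"201906{d:02d}00" for d in range(6, 31)]
print(missing_days(sample, date(2019, 6, 1), date(2019, 6, 30)))
```

Running the same check on a re-query would show whether the gap is persistent or one of the transient omissions mentioned above.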
Hi Vipul! Thanks for letting us know about this. This is indeed a problem, and I think it's related to the special character + in the title of the page. I checked general traffic for English Wikipedia, and it looks OK to me. But then I checked other pages with a + in their titles, and they show the same pattern: the data stops partway through April 24th and comes back partway through June 6th. I created a task for this, and we'll be prioritizing it soon. See: https://phabricator.wikimedia.org/T241734
Thanks a lot!
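Since the + character is the suspect here, it may help to see how such a title is percent-encoded when building a request. This is only a sketch: the base URL is the one quoted in this thread, but the order of the remaining path segments is an assumption taken from the public REST API documentation, the function name `per_article_url` is hypothetical, and the dates are placeholders.

```python
from urllib.parse import quote

BASE = "https://wikimedia.org/api/rest_v1/metrics/pageviews/per-article"


def per_article_url(project, access, agent, title, granularity, start, end):
    """Build a per-article pageviews URL with a percent-encoded title."""
    # Spaces become underscores, as in wiki page titles. safe="" makes
    # quote() escape every reserved character, so "+" becomes "%2B".
    article = quote(title.replace(" ", "_"), safe="")
    # Assumed segment order: project/access/agent/article/granularity/start/end.
    return "/".join([BASE, project, access, agent, article, granularity, start, end])


# Placeholder dates; the thread does not state the year.
print(per_article_url("en.wikipedia", "desktop", "user",
                      "Travel + Leisure", "daily", "20190501", "20190531"))
```

The encoded title matches the `Travel_%2B_Leisure` form used in the article link at the top of the thread.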
On Wed, Jan 1, 2020 at 6:39 PM Vipul Naik vipulnaik1@gmail.com wrote:
[...]
_______________________________________________
Analytics mailing list
Analytics@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/analytics
@Vipul: thanks for flagging this. We accidentally merged a change that ignored pages with a + in their title for the time period that Marcel mentioned: April 24th to June 6th. The relevant commits in our history are these:
accident: https://phabricator.wikimedia.org/rANRSd7e2b6bc1d69eeef2907df7b42bca62936149...
fix: https://phabricator.wikimedia.org/rANRS561868c68415fba92f05d78bb322be8a58bce...
The raw data is purged regularly so we couldn't rebuild. We also have generally chosen to annotate our data instead of rebuilding. I have now added this incident to the relevant page:
https://wikitech.wikimedia.org/w/index.php?title=Analytics%2FData_Lake%2FTra...
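Because the data is annotated rather than rebuilt, consumers may want to flag affected counts on their side. A rough sketch, assuming the only criteria are the ones given in this thread (a + in the title, April 24th to June 6th): the function name `in_known_gap` is hypothetical, and the year is left as a parameter because the thread does not state it.

```python
from datetime import date
from urllib.parse import unquote


def in_known_gap(title, day, year):
    """True if a missing count for this title and day matches the incident."""
    # Both boundary days were only partially affected according to the
    # thread, so they are included here to be conservative.
    gap_start = date(year, 4, 24)
    gap_end = date(year, 6, 6)
    # Accept either the raw title or its percent-encoded form ("%2B").
    has_plus = "+" in unquote(title)
    return has_plus and gap_start <= day <= gap_end


# The year below is a placeholder supplied by the caller.
print(in_known_gap("Travel_%2B_Leisure", date(2019, 5, 15), 2019))
```

A missing value that passes this check can be labeled as a known gap instead of being treated as zero traffic.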
On Thu, Jan 2, 2020 at 9:58 AM Marcel Ruiz Forns mforns@wikimedia.org wrote:
[...]
--
Marcel Ruiz Forns (he/him)
Analytics Developer @ Wikimedia Foundation
Thanks for the clarification, Marcel and Dan! It's no big deal, and the clarification helps me understand what other pages might be affected.
Vipul
On Thu, Jan 2, 2020 at 8:13 AM Dan Andreescu dandreescu@wikimedia.org wrote:
[...]