Data gap in API. Hey all, does anyone know if there is a plan to get the API loaded with data for Oct 21st? I'm seeing a lot of language versions missing data just for that day, even though they have data for the day before and the day after. Languages I've noticed with the gap include Arabic, Chinese, Russian, Japanese, Turkish, Vietnamese, Thai, and Portuguese. I'm sure there are others as well.
One example seen below: https://pageviews.toolforge.org/?project=zh.wikipedia.org&platform=all-a...
Very good question, Joshua. The underlying API indeed doesn't have the data for Oct 21: https://wikimedia.org/api/rest_v1/metrics/pageviews/per-article/zh.wikipedia...
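For anyone who wants to check which days are missing for a given article, here is a minimal sketch rather than an official tool. It assumes the per-article endpoint uses the path layout per-article/{project}/{access}/{agent}/{article}/{granularity}/{start}/{end}, that each returned item carries a "timestamp" of the form YYYYMMDDHH, and that bash and GNU date are available. The helper names pageviews_url and missing_days are made up for illustration, and the caller is expected to URL-encode the article title.

```shell
#!/usr/bin/env bash

# Build a per-article request URL (all access methods, user traffic, daily).
# Assumption: the article title passed in is already URL-encoded.
pageviews_url() {
  local project=$1 article=$2 start=$3 end=$4
  printf 'https://wikimedia.org/api/rest_v1/metrics/pageviews/per-article/%s/all-access/user/%s/daily/%s/%s\n' \
    "$project" "$article" "$start" "$end"
}

# Read an API response (JSON) on stdin and print every date (YYYYMMDD)
# between start and end, inclusive, that has no item in the response.
missing_days() {
  local day=$1 end=$2 present
  # Keep only the YYYYMMDD prefix of each "timestamp" value.
  present=$(grep -oE '"timestamp": *"[0-9]{8}' | grep -oE '[0-9]{8}$' || true)
  while [ "$day" -le "$end" ]; do
    grep -qx "$day" <<<"$present" || echo "$day"
    # Assumption: GNU date, which accepts "YYYYMMDD + 1 day".
    day=$(date -u -d "$day + 1 day" +%Y%m%d)
  done
}
```

A possible invocation would be: curl -s "$(pageviews_url zh.wikipedia Cat 20211020 20211022)" | missing_days 20211020 20211022. If the API returns no items at all for the range, every day in the range is reported as missing.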
I quickly checked, and https://dumps.wikimedia.org/other/pageview_complete/2021/2021-10/ (txt dumps of pageviews data) has the data for that day:
[urbanecm@stat1005 ~/tmp/zhwiki-pageviews-issue]$ grep '^zh.wikipedia ' pageviews-20211021-spider > pageviews-20211021-spider-zhwiki
[urbanecm@stat1005 ~/tmp/zhwiki-pageviews-issue]$ grep '^zh.wikipedia' pageviews-20211021-automated > pageviews-20211021-automated-zhwiki
[urbanecm@stat1005 ~/tmp/zhwiki-pageviews-issue]$ grep '^zh.wikipedia ' pageviews-20211021-user > pageviews-20211021-user-zhwiki
[urbanecm@stat1005 ~/tmp/zhwiki-pageviews-issue]$ grep ' Cat ' pageviews-20211021-spider-zhwiki
zh.wikipedia Cat 7535498 desktop 1 R1
[urbanecm@stat1005 ~/tmp/zhwiki-pageviews-issue]$ grep ' Cat ' pageviews-20211021-user-zhwiki
zh.wikipedia Cat 7535498 desktop 2 A1G1
zh.wikipedia Cat 7535498 mobile-web 3 K1L2
[urbanecm@stat1005 ~/tmp/zhwiki-pageviews-issue]$ grep ' Cat ' pageviews-20211021-automated-zhwiki
[urbanecm@stat1005 ~/tmp/zhwiki-pageviews-issue]$
I filed https://phabricator.wikimedia.org/T294193 for the Analytics team to investigate further. In the meantime, you can use the dumps I linked above to get the data you need (docs are at https://dumps.wikimedia.org/other/pageview_complete/readme.html).
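If you need a single number per article out of the dumps, something like the following might be enough. It is only a sketch: the column layout (project, title, page id, access method, daily total, hourly breakdown) is inferred from the sample rows in the transcript, so please check it against the readme before relying on it. The function name daily_views is made up for illustration.

```shell
#!/usr/bin/env bash

# Sum the daily totals of every row matching a project and page title
# across one or more dump files.
# Assumption: columns are space-separated and the 5th column is the
# daily total, as in "zh.wikipedia Cat 7535498 desktop 2 A1G1".
daily_views() {
  local project=$1 title=$2
  shift 2
  awk -v project="$project" -v title="$title" \
    '$1 == project && $2 == title { total += $5 } END { print total + 0 }' "$@"
}
```

With the user rows shown in the transcript, daily_views zh.wikipedia Cat pageviews-20211021-user would add the desktop and mobile-web totals (2 + 3) and print 5. Passing several files, for example the user and spider dumps, sums across all of them.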
Thanks for noticing this.
Martin
(In reply to Joshua Haecker josh@predata.com, Sat, Oct 23, 2021 at 15:31.)
Analytics mailing list -- analytics@lists.wikimedia.org To unsubscribe send an email to analytics-leave@lists.wikimedia.org
Just a quick report back: this was a job that failed on Friday, and we're restarting it soon. The data should be there after it runs.
(In reply to Martin Urbanec martin.urbanec@wikimedia.cz, Sun, Oct 24, 2021 at 12:42 PM.)