I'll add that you can submit feature requests and bug reports on Phabricator and tag our team, #Data-Engineering.

On Tue, Apr 19, 2022 at 5:27 AM Joseph Allemandou <jallemandou@wikimedia.org> wrote:
Hi Ben,

Pageview data is loaded daily into a Cassandra backend, usually between 1am and 3am UTC, depending on our cluster resource availability.
Around those hours, you may see pageviews for some pages already loaded while others are not yet available.

Another possible reason for the discrepancy you are seeing is caching: the AQS API sits behind a Varnish cache, so if you repeat a query with the same parameters, you may receive an "old" result (at most 4 hours old).

As for your question, we don't have a public place where the latest loaded day for AQS is shown - it would be a nice addition!
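
Until such a status page exists, one possible workaround is to probe the per-article endpoint for a few high-traffic "canary" pages and treat the day as loaded only once all of them return data. The sketch below is only an illustration, not an official tool: `per_article_url`, `http_status`, `missing_articles`, the canary titles and the User-Agent string are hypothetical names chosen for the example, and it assumes the endpoint answers with a non-200 status (typically 404) when no data is stored for the requested day. Because of the Varnish cache mentioned above, an answer may lag by up to 4 hours.

```python
import datetime as dt
import urllib.error
import urllib.parse
import urllib.request

# Per-article endpoint of the AQS Pageviews REST API.
BASE = "https://wikimedia.org/api/rest_v1/metrics/pageviews/per-article"


def per_article_url(project, article, day):
    """Build the daily per-article URL for a single day (YYYYMMDD00)."""
    stamp = day.strftime("%Y%m%d") + "00"
    title = urllib.parse.quote(article.replace(" ", "_"), safe="")
    return f"{BASE}/{project}/all-access/all-agents/{title}/daily/{stamp}/{stamp}"


def http_status(url):
    """Return the HTTP status code for a GET request to url."""
    # Placeholder User-Agent; replace with your own contact details.
    headers = {"User-Agent": "aqs-freshness-probe/0.1 (you@example.org)"}
    req = urllib.request.Request(url, headers=headers)
    try:
        with urllib.request.urlopen(req, timeout=10) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        return err.code


def missing_articles(project, articles, day, fetch_status=http_status):
    """Return the canary articles that do not yet have data for day.

    Assumption: any status other than 200 means "not loaded yet".
    A page with no views at all that day would look the same, so
    pick canaries that reliably get traffic.
    """
    return [
        a for a in articles
        if fetch_status(per_article_url(project, a, day)) != 200
    ]


# Example (performs real HTTP requests when uncommented):
# yesterday = dt.datetime.now(dt.timezone.utc).date() - dt.timedelta(days=1)
# print(missing_articles("en.wikipedia", ["Main Page", "Albert Einstein"], yesterday))
```

An empty list suggests the day is available for the sampled pages; it does not prove that every page has been loaded, so it is best combined with waiting until after the usual 1am to 3am UTC loading window.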

Best
Joseph




On Mon, Apr 18, 2022 at 9:35 PM Ben Smith <ben@predata.com> wrote:
Hello all,

We use the Wikimedia AQS Pageviews REST API: [Analytics/AQS/Pageviews - Wikitech](https://wikitech.wikimedia.org/wiki/Analytics/AQS/Pageviews). When requesting pageview counts by article, we have noticed that data for the latest day does not become available for all pages at the same time. Some pages appear to be updated later than others. Is there a place we can check (e.g. a status page or dump files) to determine whether all pageview data for the latest day is accessible via the AQS Pageviews REST API?

Best,
Ben

_______________________________________________
Analytics mailing list -- analytics@lists.wikimedia.org
To unsubscribe send an email to analytics-leave@lists.wikimedia.org


--
Joseph Allemandou (joal) (he / him)
Staff Data Engineer
Wikimedia Foundation