Dear Sir or Madam,
I am writing to you with a question about the Pageviews hourly raw data files
<https://dumps.wikimedia.org/other/pageviews/readme.html>. First of all,
please let me know whether I have contacted the right person. If not, could
you please advise to whom I should direct the question? The question is below.
I am working on a project where we would like to use Pageviews hourly data
<https://dumps.wikimedia.org/other/pageviews/readme.html>. For us, it is
crucial to get the data as soon as possible. According to the web page,
hourly data is available in Wikimedia's file system approximately 45 minutes
after the hour ends. For an end user, however, it only becomes available
several hours after that (as shown in the attached screenshot).
Could you help us by answering the following questions:
1. Is there any way to get data as soon as it is available on the
Wikimedia filesystem (~45 min after the hour ends)?
2. Are there any other, faster ways to get hourly data? For instance,
faster access to the raw data files, or access to *wmf.pageview_hourly
<https://wikitech.wikimedia.org/wiki/Analytics/Data_Lake/Traffic/Pageview_hourly>*
or to *wmf.pageview_actor
<https://wikitech.wikimedia.org/wiki/Analytics/Data_Lake/Traffic/Pageview_actor>*?
Unfortunately, the API does not provide data at hourly granularity.
Best regards,
Maxim Aparovich
[image: wiki-email.png]