Hi Jeremiah,
I hope you don't mind that I've cc-ed our public mailing list, which is where discussions like this belong. Dear list, Jeremiah is asking about our plans for the future of the pageview API.
We definitely see the API as a priority. Right now we are focused on maintenance: fixing bugs and improving capacity and loading processes. We are a small team and we want to make sure we have a solid platform on which to build new features.
But we are increasingly committed to publishing open data and are actively working through the tricky security and privacy implications; that's one of our main goals this quarter [1].
As for the date range available, we only have the quality source data we need going back to May 1st, 2015. We will finish back-filling to that date but we can't go further back since we delete the more sensitive raw logs that we generate this data from (for privacy reasons).
To follow our work in general, please see our backlog [2] and the tag for the pageview API [3]. We also have a large number of requests that we haven't turned into individual tasks yet; they are in the form of a conversation here: https://phabricator.wikimedia.org/T112956
[1] https://www.mediawiki.org/wiki/Wikimedia_Engineering/2015-16_Q3_Goals#Analyt...
[2] https://phabricator.wikimedia.org/tag/analytics-kanban/
[3] https://phabricator.wikimedia.org/tag/pageviews-api/ (this is new so we are still tagging the relevant tasks).
On Wednesday, February 17, 2016, Jeremiah Lewis jeremiah.lewis@razorfish.de wrote:
Hey Dan,
Oliver Keyes, with whom I've been chatting about his pageviews package for R, gave me your contact details. Oliver indicated that you are currently in charge of the pageviews REST API and might be able to shed light on the roadmap for future releases.
Although I just stumbled across the dataset this past month, the Wikimedia pageviews data appears to be a wonderful resource with a breadth and depth unusual for open source data sets about internet users and the topics they are interested in.
That said, the API is currently limited in the date range of data which it displays (only a bit over 5 months, if I'm not mistaken). What are the next steps for the API? Is it being actively maintained? Is it seen as a priority for the Wikimedia analytics team? Or are you still trying to figure out if there is enough interest in a product based on the pageviews data?
For both personal and business reasons, I'd love to see this API expanded and actively kept up; let me know if there's anything I could do in terms of publicity (e.g. a blog post) that might give this dataset higher priority internally.
I look forward to hearing your thoughts on the API's future.
Best,
Jeremiah
*Jeremiah Lewis* / Junior Business Analyst /// Skype: jpsl91
Razorfish GmbH Stralauer Allee 2b 10245 Berlin
Chamber of Commerce: Frankfurt am Main – HRB 45639, company registered and located in Frankfurt am Main. Directors: Sascha Martini, Ariel Marciano. Authorized signatory: Kai Greib.