Hi Jeremiah,
I hope you don't mind that I've cc-ed our public mailing list, which is
where discussions like this belong. Dear list, Jeremiah is asking about
our plans for the future of the pageview API.
We definitely see the API as a priority. Right now we are focused on
maintenance: fixing bugs and improving capacity and loading processes. We
are a small team and we want to make sure we have a solid platform on which
to build new features.
That said, we are increasingly committed to publishing open data and are
actively working through the tricky security and privacy implications.
That's one of our main goals this quarter [1].
As for the date range available, we only have the quality source data we
need going back to May 1st, 2015. We will finish back-filling to that date,
but we can't go further back because, for privacy reasons, we delete the
more sensitive raw logs that this data is generated from.
To follow our work in general, please see our backlog [2] and the tag for
the pageview API [3]. We also have a large number of requests that we
haven't turned into individual tasks yet. They are in the form of a
conversation here:
https://phabricator.wikimedia.org/T112956
[1] https://www.mediawiki.org/wiki/Wikimedia_Engineering/2015-16_Q3_Goals#Analy…
[2] https://phabricator.wikimedia.org/tag/analytics-kanban/
[3] https://phabricator.wikimedia.org/tag/pageviews-api/ (this is new, so we
are still tagging the relevant tasks)
On Wednesday, February 17, 2016, Jeremiah Lewis <jeremiah.lewis(a)razorfish.de>
wrote:
Hey Dan,
Oliver Keyes, with whom I've been chatting about his pageviews
package for R, gave me your contact details. Oliver indicated that you are
currently in charge of the pageviews REST API and might be able to shed
light on the roadmap for future releases.
Although I just stumbled across the dataset this past month, the
Wikimedia pageviews data appears to be a wonderful resource with a breadth
and depth unusual for open source data sets about internet users and the
topics they are interested in.
That said, the API is currently limited in the date range of data which it
displays (only a bit over 5 months, if I'm not mistaken). What are the next
steps for the API? Is it being actively maintained? Is it seen as a
priority for the Wikimedia analytics team? Or are you still trying to
figure out if there is enough interest in a product based on the
pageviews data?
For both personal and business reasons, I'd love to see this API expanded
and actively kept up. Let me know if there's anything I could do in terms of
publicity (e.g. a blog post) that might give this dataset higher
priority internally.
I look forward to hearing your thoughts on the API's future.
Best,
Jeremiah
*Jeremiah Lewis* / Junior Business Analyst /// Skype: jpsl91
Razorfish GmbH
Stralauer Allee 2b
10245 Berlin
Chamber of Commerce: Frankfurt am Main – HRB 45639, company registered and
located in Frankfurt am Main. Directors: Sascha Martini, Ariel Marciano.
Authorized signatory: Kai Greib.