On Fri, Jul 1, 2016 at 6:47 AM, Nuria Ruiz <nuria@wikimedia.org> wrote:
POST requests are more tricky, I suppose.
FYI, we have neither the POST data nor the responses to GET or POST requests; we store only the URLs and HTTP codes for both. Thus the body of the POST is also not available.
For requests to api.php that hit the backend API servers, we have the ApiAction dataset in Hadoop [0], which includes detailed data on the request parameters. There is also a refined dataset based on the raw ApiAction data in the 'bd808' database [1] that may or may not be easier to work with. The ETL for that refined data needs to be converted to Oozie jobs and moved to the 'wmf' database [2], but for now I have some ad hoc scripting running on stat1002 that updates it daily.
SELECT SUM(viewcount) AS views
FROM bd808.action_action_hourly
WHERE year = 2016
  AND month = 6
  AND action = 'wbgetclaims';
Total MapReduce CPU Time Spent: 6 minutes 52 seconds 920 msec
OK
views
5146909
Time taken: 111.763 seconds, Fetched: 1 row(s)
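
To run the same count for a different action or month, the query text can be generated rather than edited by hand. The following is only a rough sketch: the table and column names come from the query above, while the helper name `monthly_views_query` and its validation rules are hypothetical, and it assumes action names are simple identifiers like 'wbgetclaims'.

```python
import re

# Table and column names are taken from the query above; the function name
# and the validation rules are hypothetical illustrations.
TABLE = "bd808.action_action_hourly"


def monthly_views_query(year: int, month: int, action: str) -> str:
    """Return a HiveQL query summing viewcount for one API action in one month."""
    if not 1 <= month <= 12:
        raise ValueError(f"month out of range: {month}")
    # Assumption: action names are plain identifiers such as 'wbgetclaims'.
    # Rejecting anything else avoids quoting problems in the string literal.
    if not re.fullmatch(r"[A-Za-z0-9_-]+", action):
        raise ValueError(f"unexpected action name: {action!r}")
    return (
        "SELECT SUM(viewcount) AS views "
        f"FROM {TABLE} "
        f"WHERE year = {int(year)} AND month = {int(month)} "
        f"AND action = '{action}';"
    )


if __name__ == "__main__":
    print(monthly_views_query(2016, 6, "wbgetclaims"))
```

The generated string reproduces the query above for June 2016 and can be pasted into a Hive session; nothing here talks to Hadoop directly.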
[0]: https://wikitech.wikimedia.org/wiki/Analytics/Data/ApiAction
[1]: https://phabricator.wikimedia.org/T116065#2151185
[2]: https://phabricator.wikimedia.org/T137321
Bryan