On Fri, Jul 1, 2016 at 6:47 AM, Nuria Ruiz <nuria(a)wikimedia.org> wrote:
>> POST requests are more tricky, I suppose.
>
> FYI, we do not have POST data or responses for either GET or POST
> requests; we store only URLs and HTTP codes for both. Thus the body
> of the POST is also not available.

For requests to api.php that hit the backend api servers we have the
ApiAction dataset in Hadoop [0] which includes detailed data on the
request parameters. There is also a refined dataset based on the raw
ApiAction data in the 'bd808' database [1] that may or may not be
easier to work with. The ETL for that refined data needs to be
converted to Oozie jobs and moved to the 'wmf' database [2], but for
now I have some ad hoc scripting running on stat1002 that updates it
daily.

SELECT
  SUM(viewcount) AS views
FROM
  bd808.action_action_hourly
WHERE year = 2016
  AND month = 6
  AND action = 'wbgetclaims'
;

Total MapReduce CPU Time Spent: 6 minutes 52 seconds 920 msec
OK
views
5146909
Time taken: 111.763 seconds, Fetched: 1 row(s)
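
As a variation on the query above, something like the following sketch should break the same month down by action rather than totalling a single one. It assumes only the table and columns already used above (bd808.action_action_hourly, year, month, action, viewcount). The LIMIT of 10 is an arbitrary choice, and I have not run this, so no output is shown.

```sql
-- Sketch only: reuses the table and columns from the query above.
-- Sums the hourly view counts per API action for June 2016 and
-- lists the ten actions with the highest totals.
SELECT
  action,
  SUM(viewcount) AS views
FROM
  bd808.action_action_hourly
WHERE year = 2016
  AND month = 6
GROUP BY action
ORDER BY views DESC
LIMIT 10
;
```
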

[0]: https://wikitech.wikimedia.org/wiki/Analytics/Data/ApiAction
[1]: https://phabricator.wikimedia.org/T116065#2151185
[2]: https://phabricator.wikimedia.org/T137321

Bryan
--
Bryan Davis Wikimedia Foundation <bd808(a)wikimedia.org>
[[m:User:BDavis_(WMF)]] Sr Software Engineer Boise, ID USA
irc: bd808 v:415.839.6885 x6855