+External
Hi, I realized I don't get any responses from internal, but Joseph sent me something helpful this morning, so I saw all the responses up to that point. I think.
Anyway, thanks for the help!! The strange thing is that the numbers I get don't make much sense to me. For beta (using the query below), I get:
Unique IPs  num_pvs  referrer
3638        5967     external
1972        5760     internal
I would have expected a much larger external-to-internal referrer ratio. In other words, I would have expected the vast majority of sessions, or even IPs, to hit the site only once in a given hour. Instead, I am seeing that 54% of IPs are clicking a link within that hour... I would probably expect to see numbers no higher than 10%.
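As a quick sanity check on the 54% figure, here is a minimal sketch that recomputes it from the counts in the table above. One caveat: the query counts distinct IPs per referrer bucket, so an IP with both an external and an internal pageview in that hour appears in both rows, and the ratio is not strictly a share of all unique IPs. The helper name `ratio` is just for illustration.

```python
# Counts copied from the result table above (beta, one hour of traffic).
external_ips, external_pvs = 3638, 5967
internal_ips, internal_pvs = 1972, 5760


def ratio(numerator: int, denominator: int) -> float:
    """Return numerator / denominator as a percentage."""
    return 100.0 * numerator / denominator


# IPs seen with an internal referrer, relative to IPs seen with an external one.
ip_ratio = ratio(internal_ips, external_ips)

# Share of all pageviews in the table that carried an internal referrer.
internal_pv_share = ratio(internal_pvs, internal_pvs + external_pvs)

print(f"internal/external IP ratio: {ip_ratio:.1f}%")
print(f"internal share of pageviews: {internal_pv_share:.1f}%")
```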
I am probably doing something wrong, right? I *know* that I am making convenient assumptions here that do not apply to edge cases, so let's not consider those unless you think they make a big difference. Perhaps by using the referer field I am inherently leaving out all of the external traffic for which we do not have data?
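On the referer question: in the query below, the CASE expression sends every row whose referer does not match the LIKE pattern into 'external', and that includes requests with no referer at all. So that traffic is probably not left out, just mixed in with genuinely external referrals. A rough sketch of a three-way split, in Python for illustration, might look like this. How a missing referer is actually stored (NULL, empty string, or '-') is an assumption here, and the sample referers are made up.

```python
from collections import Counter
from typing import Optional


def classify_referer(referer: Optional[str], site: str = "en.m.wikipedia") -> str:
    """Bucket a referer string as 'internal', 'external', or 'none'."""
    # Assumption: a missing referer appears as None, an empty string, or '-'.
    if referer is None or referer.strip() in ("", "-"):
        return "none"
    return "internal" if site in referer else "external"


# Hypothetical sample values, only to show the three buckets.
sample_referers = [
    "https://en.m.wikipedia.org/wiki/Main_Page",
    "https://www.google.com/",
    "-",
    None,
]
bucket_counts = Counter(classify_referer(r) for r in sample_referers)
print(bucket_counts)
```

Separating out the 'none' bucket would show how much of the 'external' row is traffic with no referer data at all.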
Thanks!
-J
SELECT
  COUNT(DISTINCT ip) AS Unique_IPs,
  x_analytics_map['mf-m'] AS mobile_site,
  count(*) AS num_pvs,
  CASE WHEN referer LIKE "%en.m.wikipedia%" THEN 'internal' ELSE 'external' END AS session_depth
FROM wmf.webrequest
WHERE TRUE = TRUE
  AND webrequest_source = 'mobile'
  AND year = 2015 AND month = 5 AND day = 25 AND hour = 1
  AND agent_type = "user"
  AND is_pageview = TRUE
  AND x_analytics_map['mf-m'] IS NOT NULL
  AND uri_host LIKE "%en.m.wikipedia.org%"
GROUP BY
  CASE WHEN referer LIKE "%en.m.wikipedia%" THEN 'internal' ELSE 'external' END,
  x_analytics_map['mf-m']
ORDER BY Unique_IPs DESC
LIMIT 50;
On Thu, May 28, 2015 at 2:30 PM, Jon Katz <jkatz@wikimedia.org> wrote:
Hi, I'm trying to run a Hive query to get a rough count of one-page-only 'sessions' on mobile web. Here is the error I get:
FAILED: ParseException line 15:22 missing KW_END at 'device_family' near 'device_family' line 15:35 missing EOF at ''] <> "Spider"\n AND is_pageview = TRUE\n AND x_analytics_map['' near 'device_family
Here is the query:
SELECT COUNT(DISTINCT ip) AS hits, x_analytics_map['mf-m'] AS mobile_site, count(*) AS num_pvs, CASE WHEN referer LIKE "%en.m.wikipedia%" THEN 'internal' ELSE 'Misc’ END AS session_depth FROM wmf.webrequest WHERE YEAR = 2015 AND MONTH = 5 AND DAY = 25 AND user_agent_map['device_family'] <> "Spider" AND is_pageview = TRUE AND x_analytics_map['mf-m'] IS NOT NULL AND uri_host like "%en.m.wikipedia.org%" GROUP BY session_depth, mobile_site ORDER BY hits DESC LIMIT 50;
Any advice?
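One likely culprit for a ParseException like the one above is a typographic ("smart") quote pasted into the query: 'Misc’ closes with a curly quote, so the parser keeps reading the string until the next straight quote, which would put it right before device_family. A small sketch for spotting and straightening such characters before submitting a query follows. The function names are hypothetical, and the snippet is just an excerpt of the query above.

```python
# Typographic quotes that SQL parsers do not treat as string delimiters,
# mapped to their plain ASCII equivalents.
SMART_QUOTES = {"\u2018": "'", "\u2019": "'", "\u201c": '"', "\u201d": '"'}


def find_smart_quotes(query: str):
    """Return (line, column, character) for each typographic quote, 1-indexed."""
    hits = []
    for line_no, line in enumerate(query.splitlines(), start=1):
        for col_no, ch in enumerate(line, start=1):
            if ch in SMART_QUOTES:
                hits.append((line_no, col_no, ch))
    return hits


def straighten_quotes(query: str) -> str:
    """Replace typographic quotes with plain ASCII quotes."""
    return query.translate(str.maketrans(SMART_QUOTES))


# Excerpt from the failing query; \u2019 is the curly closing quote.
snippet = "CASE WHEN referer LIKE \"%en.m.wikipedia%\" THEN 'internal' ELSE 'Misc\u2019 END"

for line_no, col_no, ch in find_smart_quotes(snippet):
    print(f"line {line_no}, column {col_no}: {ch!r}")

fixed_snippet = straighten_quotes(snippet)
print(fixed_snippet)
```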
Thanks!
Jon