Yes Oliver, agent_type = 'spider' includes the IsCrawler UDF results.
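To make the difference concrete, here is a minimal Python sketch of how the two filters could diverge. It assumes agent_type is set to 'spider' when either the parsed device_family is 'Spider' or a crawler check matches the raw user agent. The regex in `is_crawler`, the 'user' label, and the sample rows are hypothetical stand-ins, not the actual IsCrawler UDF logic.

```python
import re

# Hypothetical stand-in for the IsCrawler UDF; the real pattern is not
# given in this thread.
CRAWLER_PATTERN = re.compile(r"bot|crawler|spider", re.IGNORECASE)


def is_crawler(user_agent):
    """Return True if the raw user agent looks like a crawler (assumed rule)."""
    return bool(CRAWLER_PATTERN.search(user_agent))


def agent_type(row):
    """Assumed rule: 'spider' if the parsed device_family is 'Spider'
    OR the crawler check matches the raw user agent; otherwise 'user'."""
    if row["user_agent_map"].get("device_family") == "Spider":
        return "spider"
    if is_crawler(row["user_agent"]):
        return "spider"
    return "user"


# Made-up sample rows, for illustration only.
rows = [
    {"user_agent": "Mozilla/5.0 (X11; Linux x86_64) Firefox/37.0",
     "user_agent_map": {"device_family": "Other"}},
    {"user_agent": "Googlebot/2.1 (+http://www.google.com/bot.html)",
     "user_agent_map": {"device_family": "Spider"}},
    {"user_agent": "ExampleCrawler/0.1",
     "user_agent_map": {"device_family": "Other"}},
]

# Equivalent of: WHERE user_agent_map['device_family'] != 'Spider'
by_device_family = [r for r in rows
                    if r["user_agent_map"]["device_family"] != "Spider"]

# Equivalent of: WHERE agent_type != 'spider'
by_agent_type = [r for r in rows if agent_type(r) != "spider"]
```

Under these assumptions the device_family filter keeps the third row, while the agent_type filter drops it, because the crawler check catches an agent that the parsed device_family does not label as a spider.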
On Fri, Apr 10, 2015 at 11:08 PM, Oliver Keyes <okeyes(a)wikimedia.org> wrote:
What does agent_type add? In the sense that if we're pre-parsing the
user agent, surely the difference is between "WHERE agent_type !=
'spider'" and "WHERE user_agent_map['device_family'] != 'Spider'"?
Does agent_type include the isCrawler UDF results?
On 10 April 2015 at 16:47, Joseph Allemandou <jallemandou(a)wikimedia.org>
wrote:
And I forgot one field:
is_zero - True if a request is made on a zero provider.
On Fri, Apr 10, 2015 at 10:36 PM, Leila Zia <leila(a)wikimedia.org> wrote:
Hi Joseph,
Thanks for the update, and for doing this. These three items make the
analysis of the data much easier on our end. We've had many requests in
the past that required agent_type and access_method information and
having them readily available is awesome! :-)
Have a great weekend!
Leila
On Fri, Apr 10, 2015 at 1:21 PM, Joseph Allemandou
<jallemandou(a)wikimedia.org> wrote:
Hi Analytics people,
Today brings another batch of additions to the refined webrequest table
in Hive.
The table now contains:
ts - The Unix timestamp (in milliseconds) of the dt date
access_method - The method used to access the site, one of
[mobile app | mobile web | desktop]
agent_type - Distinguishes spiders from users (more values may be added
later).
These additions are based on the "tags", as defined here:
https://meta.wikimedia.org/wiki/Research:Page_view
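As a rough illustration of how ts relates to dt, the sketch below converts a dt string to a Unix timestamp in milliseconds. It assumes dt is an ISO 8601 string in UTC with second precision, such as '2015-04-10T20:21:00'. The dt format and the helper name `dt_to_ts_ms` are assumptions, not taken from the table definition.

```python
import calendar
from datetime import datetime


def dt_to_ts_ms(dt):
    """Convert an ISO 8601 UTC string (assumed dt format) to Unix milliseconds."""
    parsed = datetime.strptime(dt, "%Y-%m-%dT%H:%M:%S")
    # timegm interprets the time tuple as UTC and returns whole seconds.
    return calendar.timegm(parsed.timetuple()) * 1000


example_ts = dt_to_ts_ms("2015-04-10T00:00:00")
```

Having ts as a plain number makes range comparisons and arithmetic on request times simpler than working with the dt string directly.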
Have a good weekend!
--
Joseph Allemandou
Data Engineer @ Wikimedia Foundation
IRC: joal
_______________________________________________
Analytics mailing list
Analytics(a)lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/analytics
--
Oliver Keyes
Research Analyst
Wikimedia Foundation
--
*Joseph Allemandou*
Data Engineer @ Wikimedia Foundation
IRC: joal