Yes Oliver, agent_type = 'spider' includes the isCrawler UDF results.
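
To make the difference from a plain device_family check concrete, here is a rough sketch in Python rather than the actual Hive code. It assumes that agent_type is 'spider' when either the parsed device_family is 'Spider' or the isCrawler check matches, and 'user' otherwise. The is_crawler function and its regex below are hypothetical stand-ins; the real UDF's pattern is not shown in this thread.

```python
import re

# Hypothetical stand-in for the isCrawler UDF. The real UDF's pattern is
# not given in this thread; this regex is only for illustration.
_CRAWLER_RE = re.compile(r"bot|crawler|spider|https?://", re.IGNORECASE)


def is_crawler(user_agent):
    """Return True if the raw user agent string looks like a crawler."""
    return bool(_CRAWLER_RE.search(user_agent or ""))


def agent_type(user_agent, user_agent_map):
    """Assumed tagging rule: parsed device_family OR the crawler check."""
    if user_agent_map.get("device_family") == "Spider" or is_crawler(user_agent):
        return "spider"
    return "user"
```

Under these assumptions, a request whose parsed device_family is 'Other' but whose raw user agent contains a bot URL is still tagged 'spider', which "WHERE user_agent_map['device_family'] != 'Spider'" alone would not filter out.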

On Fri, Apr 10, 2015 at 11:08 PM, Oliver Keyes <okeyes@wikimedia.org> wrote:
What does agent_type add? In the sense that if we're pre-parsing the
user agent, surely the difference is between "WHERE agent_type !=
'spider'" and "WHERE user_agent_map['device_family'] != 'Spider'"?
Does agent_type include the isCrawler UDF results?

On 10 April 2015 at 16:47, Joseph Allemandou <jallemandou@wikimedia.org> wrote:
> And I forgot one field:
>
> is_zero - True if the request is made via a zero provider.
>
>
> On Fri, Apr 10, 2015 at 10:36 PM, Leila Zia <leila@wikimedia.org> wrote:
>>
>> Hi Joseph,
>>
>>    Thanks for the update, and for doing this. These three items make the
>> analysis of the data much easier on our end. We've had many requests in the
>> past that required agent_type and access_method information and having them
>> readily available is awesome! :-)
>>
>> Have a great weekend!
>>
>> Leila
>>
>> On Fri, Apr 10, 2015 at 1:21 PM, Joseph Allemandou
>> <jallemandou@wikimedia.org> wrote:
>>>
>>> Hi Analytics people,
>>>
>>> Today brings another batch of additions to the refined webrequest table
>>> in Hive.
>>> Now the table contains:
>>>
>>> ts - The Unix timestamp (in milliseconds) of the dt date
>>> access_method - The method used to access the site, one of
>>> [mobile app | mobile web | desktop]
>>> agent_type - To differentiate easily between spiders and users (more
>>> values may be added later).
>>>
>>> These additions are based on the "tags", as defined here:
>>> https://meta.wikimedia.org/wiki/Research:Page_view
>>>
>>> Have a good weekend!
>>>
>>> --
>>> Joseph Allemandou
>>> Data Engineer @ Wikimedia Foundation
>>> IRC: joal
>>>
>>> _______________________________________________
>>> Analytics mailing list
>>> Analytics@lists.wikimedia.org
>>> https://lists.wikimedia.org/mailman/listinfo/analytics
>>>
>>
>
>



--
Oliver Keyes
Research Analyst
Wikimedia Foundation



--
Joseph Allemandou
Data Engineer @ Wikimedia Foundation
IRC: joal