Cool! Let's say I want to review the filters and apply them in a Python script. What should I reference?

On Wed, Jan 7, 2015 at 5:13 PM, Oliver Keyes <okeyes@wikimedia.org> wrote:
I'm pleased to say we now have the prototype pageviews definition as a UDF!

For those with cluster access:

CREATE TEMPORARY FUNCTION pageview as
'org.wikimedia.analytics.refinery.hive.isPageviewUDF';

...and then just apply it. It outputs a boolean, so you can easily write
WHERE pageview(fields) and treat it as a conditional. Great
success!
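
For illustration, a minimal sketch of what such a query might look like, using the function name registered by the CREATE TEMPORARY FUNCTION statement above. The table name, the column names, and the argument list are assumptions made for the example (the message only says "fields"), so check them against the UDF source before relying on them.

```sql
-- Sketch only: the table, columns, and argument list are assumptions,
-- not taken from the original message.
-- The jar that provides the class must already be available to the
-- Hive session (for example via ADD JAR) before the function is created.
CREATE TEMPORARY FUNCTION pageview AS
  'org.wikimedia.analytics.refinery.hive.isPageviewUDF';

SELECT COUNT(*) AS pageviews
FROM webrequest_sample                    -- hypothetical table name
WHERE pageview(uri_host, uri_path, uri_query,
               http_status, content_type,
               user_agent);               -- assumed argument list
```

Because the UDF returns a boolean, it can also be used in a SELECT list or negated with NOT to inspect the requests it rejects.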

What this means for the definition is twofold: it means it's a lot
easier to test its accuracy, and it means that it's a lot easier to
make sure we're all using the same definition going forward. Once we
have the legacy definition as a UDF, refining and testing will proceed
at great speed, although I encourage anyone with time on their hands
who wants to help out to do some testing of their own :)

--
Oliver Keyes
Research Analyst
Wikimedia Foundation

_______________________________________________
Analytics mailing list
Analytics@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/analytics