Looking at this script:
https://github.com/wikimedia/kraken/blob/master/hive/scripts/top_sessions.s…
I see two versions, and I'm guessing you're poking around to see how the
optimizer handles that. Did one of them end up being faster?
On Mon, Apr 29, 2013 at 4:31 PM, David Schoonover <dsc(a)wikimedia.org> wrote:
Over the last week I've created Hive tables for many of our larger
datasets in Hadoop. Those were used to generate many of the results you've
seen in the last few days.
Both the schemas for those tables and the job-scripts can be found in:
- https://github.com/wikimedia/kraken/tree/master/hive
Questions welcome.
--
David Schoonover
dsc(a)wikimedia.org