Over the last week I've created hive tables for many of our larger datasets in Hadoop. Those were used to generate many of the results you've seen in the last few days.
Both the schemas for those tables and the job-scripts can be found in: https://github.com/wikimedia/kraken/tree/master/hive
Questions welcome. -- David Schoonover dsc@wikimedia.org
Looking at this script: https://github.com/wikimedia/kraken/blob/master/hive/scripts/top_sessions.sq... I see two versions, and I'm guessing you're poking around to see how the optimizer handles that. Did one of them end up being faster?
On Mon, Apr 29, 2013 at 4:31 PM, David Schoonover dsc@wikimedia.org wrote:
Over the last week I've created hive tables for many of our larger datasets in Hadoop. Those were used to generate many of the results you've seen in the last few days.
Both the schemas for those tables and the job-scripts can be found in: https://github.com/wikimedia/kraken/tree/master/hive
Questions welcome.
David Schoonover dsc@wikimedia.org
Analytics mailing list Analytics@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/analytics
I wasn't sure whether Hive would let me order on a synthetic field. The latter version is better.
-- David Schoonover dsc@wikimedia.org
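For readers without the script at hand: the exchange above is about ordering by a computed ("synthetic") column such as an aggregate alias. The script's contents are not reproduced in this thread, so the following is only a sketch of the two general shapes such a query can take, not the actual top_sessions code. It uses Python's built-in sqlite3 module as a stand-in for Hive, and the table name `requests` and columns `session_id` and `hits` are hypothetical. Hive's behavior and performance may differ from SQLite's.

```python
import sqlite3

# Hypothetical table and column names; the real schemas live in the kraken repo.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE requests (session_id TEXT, url TEXT)")
conn.executemany(
    "INSERT INTO requests VALUES (?, ?)",
    [("a", "/x"), ("a", "/y"), ("a", "/z"), ("b", "/x"), ("c", "/x"), ("c", "/y")],
)

# Shape 1: compute the synthetic field in a subquery, then order in the outer query.
nested = conn.execute(
    """
    SELECT session_id, hits
    FROM (SELECT session_id, COUNT(*) AS hits
          FROM requests
          GROUP BY session_id) AS counted
    ORDER BY hits DESC, session_id
    LIMIT 2
    """
).fetchall()

# Shape 2: order directly on the alias of the aggregate.
direct = conn.execute(
    """
    SELECT session_id, COUNT(*) AS hits
    FROM requests
    GROUP BY session_id
    ORDER BY hits DESC, session_id
    LIMIT 2
    """
).fetchall()

print(nested)
print(direct)
```

Both queries return the same rows in SQLite. Whether a given Hive version accepts the direct form, and which form its optimizer handles better, would need to be checked against that version.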
On Tue, Apr 30, 2013 at 8:52 AM, Dan Andreescu dandreescu@wikimedia.org wrote:
Looking at this script: https://github.com/wikimedia/kraken/blob/master/hive/scripts/top_sessions.sq... I see two versions, and I'm guessing you're poking around to see how the optimizer handles that. Did one of them end up being faster?