This is indeed awesome -- thanks for trying out the new tools (and all the pain involved) and also for documenting your work.


On Mon, Dec 1, 2014 at 8:07 AM, Andrew Otto <> wrote:

On Nov 30, 2014, at 16:06, Aaron Halfaker <> wrote:

Hey folks,

I just finished a blog post about how I'm incorporating Hadoop streaming into my workflow.

TL;DR: I have strong opinions about Good Ways(TM) to process large data files in interesting ways, and Hadoop streaming will support them nicely.  :)

Props to ottomata for spending a bunch of time helping me get up to speed with our cluster, and to gage for making it easier to find Hadoop's error messages.

Analytics mailing list
