Hey folks,
TL;DR: I have strong opinions about Good Ways(TM) to process large data files in interesting ways, and Hadoop Streaming will support them nicely. :)
Props to ottomata for spending a bunch of time helping me get up to speed with our cluster, and to gage for making it easier to find Hadoop's error messages.
-Aaron