On Thu, Oct 21, 2010 at 5:28 PM, Neil Kandalgaonkar neilk@wikimedia.org wrote:
I feel that this has to be a symptom of some other problem. What sort of things go into "live hacks"?
If they are about rapidly reconfiguring, rolling back, or turning off features, I think that's better answered by having an explicit system to do such a thing (see my other post in this thread about Flickr's system).
Primarily:
1) debug logging statements to provide additional information on problems seen in production that can't yet be reproduced offline
2) temporary performance hacks to disable individual code paths in particular circumstances (say, the caching bug that caused serious cache contention on the 'Michael Jackson' article one day) -- these are usually not "features" but more like "this chunk of processing for this feature when used in a very particular way on this one article"
3) horrible, horrible temporary hacks to block particularly unpleasant actions or make exceptions for something that other code doesn't yet allow.
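To make the second kind concrete, here is a rough sketch of what such a narrowly scoped performance hack amounts to. It is illustrative Python, not MediaWiki code: the names DISABLED_PATHS, path_enabled, render and the "expensive_cache_rebuild" step are all made up for the example.

```python
# Hypothetical sketch: none of these names come from MediaWiki.
# Each entry switches off one code path for one page title only.
DISABLED_PATHS = {
    ("expensive_cache_rebuild", "Michael Jackson"),
}


def path_enabled(path, title, disabled=DISABLED_PATHS):
    """Return False only when this code path is switched off for this title."""
    return (path, title) not in disabled


def render(title):
    """Return the list of processing steps that would run for a page."""
    steps = ["parse"]
    # The hack: skip one chunk of processing, but only for the affected page.
    if path_enabled("expensive_cache_rebuild", title):
        steps.append("expensive_cache_rebuild")
    steps.append("output")
    return steps
```

Every other page still goes through the normal path; only the one (path, title) pair is affected, which is why this is not really a "feature" toggle.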
These are usually done live because whatever you're reacting to is live -- the code is part of a production debugging session.
Debug logging hacks usually are discardable immediately. Performance hacks usually need to be maintained or replaced with better code -- these are the ones we had to worry about not accidentally losing by replacing the live deployment with code from trunk. :) Temporary hacks to disable or enable things or help catch vandalism are sort of an in-between space.
-- brion