On Thu, Oct 21, 2010 at 5:28 PM, Neil Kandalgaonkar <neilk(a)wikimedia.org> wrote:
I feel that this has to be a symptom of some other problem. What sort of
things go into "live hacks"?
If they are about rapidly reconfiguring, rolling back, or turning off
features, I think that's better answered by having an explicit system to
do such a thing (see my other post in this thread about Flickr's system).
Primarily:
1) debug logging statements to provide additional information on problems
seen in production that can't yet be reproduced offline
2) temporary performance hacks to disable individual code paths in
particular circumstances (say, the caching bug that caused serious cache
contention on the 'Michael Jackson' article one day) -- these are usually
not "features" but more like "this chunk of processing for this feature when
used in a very particular way on this one article"
3) horrible, horrible temporary hacks to block particularly unpleasant
actions or make exceptions for something that other code doesn't yet allow.
These are usually done live because whatever you're reacting to
is live -- the code is part of a production debugging session.
Debug logging hacks usually are discardable immediately. Performance hacks
usually need to be maintained or replaced with better code -- these are the
ones we had to worry about not accidentally losing by replacing the live
deployment with code from trunk. :) Temporary hacks to disable or enable
things or help catch vandalism are sort of an in-between space.
-- brion