Just an FYI here that this has been done, yay! Varnish, Nginx, and Squid frontends are
now all logging with tab as the field delimiter.
For those who would notice: for the time being, we are writing logs to new filenames
with .tab. in the name, to distinguish the new format from the old one. We will most
likely change the file names back to their original names in a month or so.
On Jan 28, 2013, at 11:33 AM, Matthew Flaschen <mflaschen(a)wikimedia.org> wrote:
On 01/27/2013 08:07 AM, Erik Zachte wrote:
The code to change existing tabs into some less obnoxious character is dead
trivial, hardly any overhead. At worst one field will then be affected, not
the whole record, which makes it easier to spot and debug the anomaly when
Scanning an input record for tabs and raising a counter is also very
efficient. Sending one alert hourly based on this counter should make us
aware soon enough when this issue needs follow-up, yet without causing
Doing both of those would be pretty robust. However, if that isn't
workable, a simple option is just to strip tab characters before
Varnish/Squid/etc. writes the line.
That means downstream code doesn't have to do anything special, and it
shouldn't affect many actual requests.
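
A minimal sketch of the two ideas above (replace or strip tabs inside field
values before the line is written, and count affected records so an hourly
check can alert on them). This is illustrative Python only, not the actual
Varnish/Nginx/Squid logging code; the function names, the counter key, and the
sample field values are all hypothetical.

```python
def sanitize_fields(fields, replacement=" "):
    """Replace tabs inside each field value; return (cleaned_fields, tabs_found).

    Pass replacement="" to strip tabs instead of substituting a character.
    """
    tabs_found = sum(f.count("\t") for f in fields)
    cleaned = [f.replace("\t", replacement) for f in fields]
    return cleaned, tabs_found


def format_record(fields, counter, replacement=" "):
    """Build one tab-delimited log line, counting records that contained tabs."""
    cleaned, tabs_found = sanitize_fields(fields, replacement)
    if tabs_found:
        # A periodic (e.g. hourly) job could read and reset this counter
        # and send an alert when it is non-zero.
        counter["records_with_tabs"] += 1
    return "\t".join(cleaned)


# Hypothetical example: a user agent string that contains a literal tab.
counter = {"records_with_tabs": 0}
line = format_record(["10.0.0.1", "GET", "Mozilla/5.0\tFake"], counter)
stripped = format_record(["a\tb", "c"], {"records_with_tabs": 0}, replacement="")
```

With this, a stray tab changes only the field it appeared in, and downstream
code can keep splitting each line on tab without special cases.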
Analytics mailing list