Apologies for crossposting
The Analytics Team is planning to deploy "tab as field delimiter" to
replace the current space as fielddelimiter on the varnish/squid/nginx
servers. We would like to do this on February 1st. The reason for this
change is that we need to have a consistent number of fields in each
webrequest log line. Right now, some fields contain spaces and that require
a lot of post-processing cleanup and slows down the generation of reports.
What is affected and maintained by Analytics
* udp-filter already has support for the tab character
* webstatscollector: we compiled a new version of filter to add support for
the tab character
* wikistats: we will fix the scripts on an ongoing basis.
* udp2log: we have a patch ready for inserting sequence numbers separated
In particular, I would like to have feedback to three questions:
1) Are there important reasons not to use tab as field delimiter?
2) Are there important pieces of logging that expect a space instead of a
tab and that need to be fixed and that I did not mention in this email?
3) Is February 1st a good date to deploy this change? (Assuming that all
preps are finished)