Hi Faidon,
On Mon, Jan 26, 2015 at 04:09:32PM -0800, Andrew Otto wrote:
> I’ll let qchris respond in more detail, [...]
I do not have many further details.
Currently, udp2log contains misc (not directly via varnish, but indirectly via nginx), and hence misc logs can be queried live, and they also make it onto disk, for example in oxygen's 5xx tsvs (~1.8K misc log lines/day in the 5xx tsvs).
When preparing to switch the tsvs from udp2log to kafka, the guiding principle was that the kafka-based tsvs should not needlessly discard parts of the traffic that were previously in the udp2log-based tsvs.
Hence, when recreating the 5xx tsvs using kafka, the expectation was that misc logs would continue to be in those tsvs.
But if you want to make the point that misc need not be logged and that misc wasn't intentionally in udp2log and the 5xx tsvs, then by all means: Yes, agreed, let's remove it. From both kafka and udp2log. I am all for it.
The less we need to log, the better.
Have fun, Christian
P.S.: Bits and misc are quite alike in terms of logging setup [1] and, from my point of view, also in terms of the motivation for being in udp2log/kafka. Does this mean bits can/should be dropped from udp2log/kafka too, by the same reasoning?
(This is totally not sarcastic. I am serious. If there is a chance of logging less, we should consider it.)
P.P.S.: There are occasional one-off requests on both misc and bits (like “Are people still requesting $DEPRECATED_URL_FOO?”) but those can also be answered through temporary means instead of permanent logging.
[1] Both are in udp2log not because of the varnishes, but because of the nginxs. Into kafka, however, both of them feed their varnish logs.
(Ok, kafka's bits is currently temporarily turned off. But still.)