Just speaking for the Kafka/Hadoop use case, you'd be perfectly able
to grep through without having to wade through production-level
requests; HDFS files are very deliberately partitioned on the class of
source varnish (mobile, text, misc, upload, etc.), so you can just grep
through the misc files.
(Unless you meant a literal grep rather than a figurative one. In
which case, ignore this ;p)
On 30 January 2015 at 04:51, Faidon Liambotis <faidon(a)wikimedia.org> wrote:
On Tue, Jan 27, 2015 at 01:23:10PM +0100, Christian Aistleitner wrote:
But if you want to make the point that misc need not be logged and
misc wasn't intentionally in udp2log and the 5xx tsvs, then by all
means: Yes, agreed, let's remove it. From both kafka and udp2log.
I am all for it.
I don't think it was intentional, no. Even if it was at the time, I
think it'd be wrong to put everything into the same pool of
logs/statistics. Production should be separate and we shouldn't have to
grep production 5xxs in the same log that also has e.g. git.wm.org's
5xx.
All that said, a (separate) 5xx log of misc services can be useful, so I
wouldn't object.
Faidon
_______________________________________________
Analytics mailing list
Analytics(a)lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/analytics
--
Oliver Keyes
Research Analyst
Wikimedia Foundation