Re: [Analytics] [Ops] webrequest_misc added to Kafka + Hive

30 Jan 2015

Just speaking for the Kafka/Hadoop use case, you'd be perfectly able
to grep through without having to hit production-level requests; HDFS
files are very deliberately partitioned on the class of source varnish
(mobile, text, misc, upload, etc): you can just grep through the misc
files.

(Unless you meant a literal grep rather than a figurative one. In
which case, ignore this ;p)
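
To make the partitioning point concrete, here is a minimal sketch rather than the actual cluster setup. It assumes a Hive-style directory per source class (the `webrequest_source=<class>` directory name, the `.tsv` extension, the tab-separated host/status/path row layout, and the helper names are all illustrative assumptions, not taken from the real pipeline), and it shows that a search scoped to the misc partition never reads the text or upload files:

```python
import re
from pathlib import Path

# Assumed row layout: host <TAB> status <TAB> path. Matches any 5xx status field.
STATUS_5XX = re.compile(r"\t5\d\d\t")


def grep_partition(root, source, pattern):
    """Return matching lines from files under root/webrequest_source=<source>/ only."""
    partition = Path(root) / f"webrequest_source={source}"
    hits = []
    for path in sorted(partition.rglob("*.tsv")):
        with path.open() as handle:
            hits.extend(line.rstrip("\n") for line in handle if pattern.search(line))
    return hits


def build_demo_tree(root):
    """Write a tiny, made-up partitioned tree so the sketch can run locally."""
    rows = {
        "text": ["en.wikipedia.org\t500\t/wiki/X", "en.wikipedia.org\t200\t/wiki/Y"],
        "misc": ["git.wikimedia.org\t503\t/r/p", "git.wikimedia.org\t200\t/"],
    }
    for source, lines in rows.items():
        directory = Path(root) / f"webrequest_source={source}"
        directory.mkdir(parents=True, exist_ok=True)
        (directory / "part-0.tsv").write_text("\n".join(lines) + "\n")
```

Calling `grep_partition(root, "misc", STATUS_5XX)` on the demo tree only opens files in the misc directory, so production 5xx lines from the text partition cannot show up in the result.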

On 30 January 2015 at 04:51, Faidon Liambotis <faidon(a)wikimedia.org> wrote:
...
  On Tue, Jan 27, 2015 at 01:23:10PM +0100, Christian Aistleitner wrote:
  But if you want to make the point that misc need not be logged and
  misc wasn't intentionally in udp2log and the 5xx tsvs, then by all
  means: Yes, agreed, let's remove it. From both kafka and udp2log.
  I am all for it.

 I don't think it was intentional, no. Even if it was at the time, I
 think it'd be wrong to put everything into the same pool of
 logs/statistics. Production should be separate and we shouldn't have to
 grep production 5xxs in the same log that also has e.g. git.wm.org's
 5xx.

 All that said, a (separate) 5xx log of misc services can be useful, so I
 wouldn't object.

 Faidon

 _______________________________________________
 Analytics mailing list
 Analytics(a)lists.wikimedia.org
 https://lists.wikimedia.org/mailman/listinfo/analytics 

-- 
Oliver Keyes
Research Analyst
Wikimedia Foundation
