and kafka seems to still eat the first character of the Headers in some settings
Aye, I know. I haven’t deployed Magnus’ fix yet. Since it wasn’t crucial (is it?) I was going to wait a while in case we have any other code changes to ship along with it.
On Mar 4, 2014, at 5:03 AM, Christian Aistleitner <christian@quelltextlich.at> wrote:
Hi,
On Mon, Mar 03, 2014 at 09:19:03PM -0500, Andrew Otto wrote:
Who generates the mobile and zero limn dashboards?
For the Wikipedia Zero limn dashboards that would be me.
I am currently generating log files from Kafka that should be useable in place of the ones that udp2log generates. It’d be cool to compare the output of the two.
Totally! Hoping Toby puts it on the agenda for the upcoming Sprint.
Have fun, Christian
P.S.: I looked at the files yesterday, and kafka seems to still eat the first character of the Headers in some settings. For example, when looking for User Agents starting with “okia” (“Nokia” without the leading “N”), the kafkatee file has >10K matches for the 20140303 file [1], while the udp2log file has 0 [2].
[1]
___________________________________________________________
qchris@stat1002 // 0 // 09:58:21
cwd: ~
zcat /a/log/webrequest/zero/zero.tsv.log-20140303.gz | cut -f 14 | grep -c '^okia'
11552

[2]
___________________________________________________________
qchris@stat1002 // 0 // 09:58:32
cwd: ~
zcat /a/squid/archive/zero/zero.tsv.log-20140303.gz | cut -f 14 | grep -c '^okia'
0
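
For anyone wanting to repeat the check in [1] and [2] on other days or other prefixes, here is a rough sketch of the same count wrapped in a function. The name count_field_prefix is made up for illustration, and the default of field 14 is only carried over from the commands above. It uses awk instead of cut and grep, so a row with fewer fields than requested is treated as an empty value rather than passed through whole, and the prefix is matched literally rather than as a regular expression (a prefix containing backslashes would need extra care, since awk -v interprets escape sequences).

```shell
#!/usr/bin/env bash

# Hypothetical helper: count rows of a gzipped TSV log whose given
# tab-separated field starts with a literal prefix.
#   $1 = path to the .gz file
#   $2 = prefix to look for (literal string, not a regex)
#   $3 = 1-based field number (defaults to 14, as in [1] and [2])
count_field_prefix() {
    local file="$1" prefix="$2" field="${3:-14}"
    gzip -dc -- "$file" | awk -F '\t' -v f="$field" -v p="$prefix" '
        index($f, p) == 1 { n++ }    # field begins with the prefix
        END { print n + 0 }          # print 0 when nothing matched
    '
}

# Example calls, using the two paths from [1] and [2]:
#   count_field_prefix /a/log/webrequest/zero/zero.tsv.log-20140303.gz okia
#   count_field_prefix /a/squid/archive/zero/zero.tsv.log-20140303.gz okia
```

Running it with both “okia” and “Nokia” against the same file should give a quick feel for how many rows lost their first character.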
--
---- quelltextlich e.U. ---- \ ---- Christian Aistleitner ----
Companies' registry: 360296y in Linz
Christian Aistleitner
Gruendbergstrasze 65a
4040 Linz, Austria
Email: christian@quelltextlich.at
Phone: +43 732 / 26 95 63
Fax: +43 732 / 26 95 63
Homepage: http://quelltextlich.at/
Analytics mailing list Analytics@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/analytics