Hi Katie,
Could you please give a bit more detail regarding the "significant amount of havoc with various crucial systems we have in place"?
Thanks!
Diederik


On Tue, Nov 13, 2012 at 3:58 PM, Katie Horn <khorn@wikimedia.org> wrote:
Hi Andrew,

We are almost completely sure that this set of changes would cause a significant amount of havoc with various crucial systems we have in place, and we definitely don't have time to shake bugs out of those systems at this point, as they are all already in heavy use. The only good time to deploy those changes in the foreseeable future would be (approximately) January.

Sorry about the bad news,
-Katie



On Tue, Nov 13, 2012 at 12:10 PM, Andrew Otto <otto@wikimedia.org> wrote:
Hi guys!

So, we've had a to-do on our list for a while now to make a couple of tweaks to the web access log format coming from squid, varnish and nginx.

1. Append Accept-Language and X-Carrier headers.
This brings the field count from 14 up to 16.  udp-filter has already been modified to handle this.  I've already got a change in for this:  https://gerrit.wikimedia.org/r/#/c/12188/

2.  Change field separator from space to tab.
User-Agent and Content-Type headers (and possibly others) sometimes contain spaces.  Some sources (e.g. varnish) properly URL-encode the fields before they are sent out, but others don't.  Using tab as the field separator in web access logs will avoid many of these issues.
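
To make the separator problem concrete, here is a minimal sketch in Python.  The sample field values, their order, and the helper name split_fields are all made up for illustration; only the field count of 16 comes from the change described above.  It joins the same 16 fields once with spaces and once with tabs, then splits each line again.

```python
# Hypothetical 16-field log record. The values and their order are
# assumptions for illustration, not the real squid/varnish/nginx format.
FIELDS = [
    "cp1001.example", "12345", "2012-11-13T12:10:00", "0.001", "10.0.0.1",
    "TCP_MISS/200", "512", "GET", "http://en.wikipedia.org/wiki/Main_Page",
    "NONE/-", "text/html; charset=UTF-8", "-", "-",
    "Mozilla/5.0 (X11; Linux x86_64)",   # User-Agent, contains spaces
    "en-US,en;q=0.8",                    # Accept-Language (new field)
    "-",                                 # X-Carrier (new field)
]

EXPECTED_FIELD_COUNT = 16


def split_fields(line, separator):
    """Split one log line on the given separator (hypothetical helper)."""
    return line.rstrip("\n").split(separator)


space_line = " ".join(FIELDS)
tab_line = "\t".join(FIELDS)

space_fields = split_fields(space_line, " ")
tab_fields = split_fields(tab_line, "\t")

# Unencoded spaces inside Content-Type and User-Agent inflate the count.
print(len(space_fields), len(space_fields) == EXPECTED_FIELD_COUNT)
# With tabs, embedded spaces stay inside their field.
print(len(tab_fields), len(tab_fields) == EXPECTED_FIELD_COUNT)
```

With space as the separator, the unencoded Content-Type and User-Agent values are broken into extra pieces, so a consumer that expects a fixed field count either rejects the line or misreads every field after the first embedded space.  With tab, the line splits back into the original 16 fields.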

We have wanted to do this for a while, but haven't because we were worried about breaking Erik Zachte's wikistats scripts.  Stefan Petrea is now working with Diederik on wikistats (and other things), and has dealt with this issue.  So!  We are ready!  We'd like to make this change before we start real consumption of the web access logs into the Kraken cluster, which hopefully will be relatively soon.  

Would these changes cause Fundraising any foreseeable problems?  Can we go ahead and work with ops to push this through?

Thanks!
-Andrew Otto





_______________________________________________
Analytics mailing list
Analytics@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/analytics