Well, he's still suggesting we switch to tab as the delimiter at the source.  Same solution, but with the extra bonus of allowing udp-filter to give our downstream consumers what they currently expect.


On May 10, 2012, at 2:08 PM, Diederik van Liere wrote:

Hi Erik,

Yes, it is backward compatible, but that does not outweigh the drawbacks. It's not simple, as it creates a disconnect between the configuration of the server log and the actual output. In addition, it is not a future-proof solution: we also want to stream the server log data to the analytics cluster, and then we will still be stuck with the same problem (as streaming the data into the analytics cluster will not depend on the udp-filter software). We should apply a real solution, not a monkey patch.

D

On Thu, May 10, 2012 at 2:03 PM, Erik Zachte <ezachte@wikimedia.org> wrote:

There are more suggestions hanging in the air waiting to be shot down.

 

Character replacement in C is very cheap.

So why not feed Diederik's filter with tab-delimited data, and export space-delimited data?

 

The filter first replaces all (non-delimiting) spaces by underscores, then replaces all (delimiting) tabs by spaces.
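
To make the idea concrete, here is a rough sketch of that replacement in C. It is only an illustration: the function name convert_delimiters and the in-place buffer interface are assumptions, not existing udp-filter code. Because the two replacements never touch the same character, they can be folded into a single pass over the line.

```c
#include <stddef.h>

/* Hypothetical helper, not part of udp-filter: rewrites one tab-delimited
 * log line in place so that it comes out space-delimited.
 * Spaces inside fields become underscores; delimiting tabs become spaces. */
void convert_delimiters(char *line, size_t len)
{
    for (size_t i = 0; i < len; i++) {
        if (line[i] == ' ')
            line[i] = '_';      /* space inside a field */
        else if (line[i] == '\t')
            line[i] = ' ';      /* field delimiter */
    }
}
```

Called on a line such as "GET<tab>/wiki/Main Page", this would give "GET /wiki/Main_Page". Note that the change is lossy: downstream consumers can no longer tell an original underscore from a replaced space.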

 

Simple, and backward compatible.

 

Erik

 

From: analytics-bounces@lists.wikimedia.org [mailto:analytics-bounces@lists.wikimedia.org] On Behalf Of Diederik van Liere
Sent: Thursday, May 10, 2012 3:57 PM
To: analytics@lists.wikimedia.org
Subject: Re: [Analytics] Using tab as delimiter instead of space in the log files

 

So far nobody has responded to my inquiry on whether they would be affected by this change. So please let us know if you are consuming a server log and you are expecting spaces as delimiters. We want to make sure that we are aware of everyone who will be affected by this.

 

Best,

Diederik


_______________________________________________
Analytics mailing list
Analytics@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/analytics

