Diederik and Evan D — Sent from Mailbox for iPhone
On Mon, Jul 22, 2013 at 9:27 AM, Christian Aistleitner christian@quelltextlich.at wrote:
Hi,

when doing some basic sanity checks between the output of the existing zero_country and zero_carrier Pig scripts, it seems that the per-day sum of the number of requests in the zero_country output is ~40k larger than in the zero_carrier output.

First, I was told that the sums of the number of requests have to match. Afterwards, I was told that the difference is ok, as zero_country should hold all of the mobile requests from a country, and zero_carrier is a drill-down on the specific carriers.

When reading the Pig scripts/Java code, it is obvious that the first explanation does not match the code. The scripts take completely different paths through our code base and count completely different things :-(

However, the latter explanation does not make much sense to me either, as it's hard to believe that the requests from our zero partners make up >90% of each country's mobile requests. Besides, this explanation would not match how we generate the raw log files.

Whom could I ask about what the desired semantics of zero_{carrier,country} are?

Best regards,
Christian

--
---- quelltextlich e.U. ---- \ ---- Christian Aistleitner ----
Companies' registry: 360296y in Linz
Christian Aistleitner
Gruendbergstrasze 65a
4040 Linz, Austria
Email: christian@quelltextlich.at
Phone: +43 732 / 26 95 63
Fax: +43 732 / 26 95 63
Homepage: http://quelltextlich.at/