Hi Andrew,
On Mon, Jul 22, 2013 at 12:43:09PM -0400, Andrew Otto wrote:
I think most of the logic you are referring to is built into the two pageview UDFs, right?
Yes. The problem, however, is that the two UDFs filter different things.
Consider, for example, a request to a search page. The UDF used by zero_carrier.pig counts it, while the UDF used by zero_country.pig filters it out.
An example in the other direction is a request for http://ar.m.wikipedia.org/w/index.php?title=%D9%85%D9%84%D9%81:Abha_01.jpg&a... which zero_country.pig would count, but zero_carrier.pig's UDF would filter out.
So neither script's set of counted rows is a subset of the other's :-(
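To make the set relationship concrete, here is a toy sketch in Python. It does not reproduce the actual UDF logic: the request labels are hypothetical stand-ins for the two examples above, and "article_page" is an assumed third case that both filters keep.

```python
# Toy illustration only; the real filtering lives in the two pageview UDFs.
# "search_page" and "file_page" stand in for the two examples above;
# "article_page" is an assumed request that both scripts would count.
kept_by_carrier = {"search_page", "article_page"}  # rows zero_carrier.pig counts
kept_by_country = {"file_page", "article_page"}    # rows zero_country.pig counts


def compare(a, b):
    """Describe how two sets of counted rows relate to each other."""
    return {
        "only_a": a - b,        # counted by the first script only
        "only_b": b - a,        # counted by the second script only
        "a_subset_b": a <= b,   # True if every row in a is also in b
        "b_subset_a": b <= a,   # True if every row in b is also in a
    }


result = compare(kept_by_carrier, kept_by_country)
print(result)
```

Both subset checks come out false here, because each set holds a row the other lacks. Under these assumptions, that is why neither script's output could be derived by further filtering the other's.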
Best regards, Christian