On Mon, Apr 26, 2010 at 5:52 PM, Platonides <Platonides(a)gmail.com> wrote:
Anthony wrote:
What kind of space needs are we talking about?
100k requests per second.
Assuming that a URL is 50 bytes on average, that's 432 GB per day (the
usual Apache log line is about 1.5 times that).
Seems reasonable. For 3 days of access that's about 18.5 GB per server
across 70 servers.
And that's without compression, and 50 bytes seems awfully long for a URL.
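
For anyone who wants to check the arithmetic, here is a quick sketch in Python. The request rate, URL size, retention period and server count are the figures quoted in this thread; the variable names and the use of decimal gigabytes (10^9 bytes) are my own assumptions.

```python
# Back-of-the-envelope check of the storage figures quoted in this thread.
REQUESTS_PER_SECOND = 100_000  # quoted request rate
BYTES_PER_URL = 50             # assumed average URL length, as quoted
SECONDS_PER_DAY = 86_400
RETENTION_DAYS = 3             # quoted retention period
SERVERS = 70                   # quoted server count

# Raw, uncompressed URL bytes logged per day.
bytes_per_day = REQUESTS_PER_SECOND * BYTES_PER_URL * SECONDS_PER_DAY
gb_per_day = bytes_per_day / 1e9  # decimal gigabytes (assumption)

# Storage per server if the retained data is spread evenly.
gb_per_server = gb_per_day * RETENTION_DAYS / SERVERS

print(f"{gb_per_day:.0f} GB per day")
print(f"{gb_per_server:.1f} GB per server")
```

A shorter average URL or any compression would only bring these numbers down.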
What if your referer was your Facebook personal page leaking your full
real name?
And what if you're in the sample? I find it quite inappropriate that
even sampled data like this is being released.
The referer is not stored anywhere.
Well, that's good to hear. What exactly is contained in the sampled data
which is being released? We've heard what's in the 1/10th sample Mr.
Priedhorsky is getting, but what about the rest?