On Tue, Dec 6, 2011 at 2:58 PM, Erik Zachte <ezachte@wikimedia.org> wrote:
Asher,
I quote Tomasz,
"October is likely under reported by up to 25% while November is under reported by up to 50%."
'Likely' and 'up to' don’t sound like a high level of confidence.
Apparently you have better figures?
For October, cp1043 stopped logging on 10/15 at 23:24:23, so it wasn't counted for a tad over 50% of the month.
For November, cp1043 was logging for only 35 hours, so it was missing for around 95% of the month.
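As a rough check of those two fractions, here is a minimal sketch. It assumes cp1043 logged continuously from the start of October until the stop time, that the 35 hours are the only logged time in November, and that all times share one timezone. The function name `missing_fraction` is made up for illustration.

```python
from datetime import datetime


def missing_fraction(logged_seconds: float, start: datetime, end: datetime) -> float:
    """Share of the period [start, end) during which nothing was logged."""
    total_seconds = (end - start).total_seconds()
    return 1.0 - logged_seconds / total_seconds


# October 2011: assumed to log from the 1st until 10/15 23:24:23.
oct_start = datetime(2011, 10, 1)
oct_end = datetime(2011, 11, 1)
oct_stopped = datetime(2011, 10, 15, 23, 24, 23)
oct_logged = (oct_stopped - oct_start).total_seconds()
oct_missing = missing_fraction(oct_logged, oct_start, oct_end)

# November 2011: 35 hours of logging in a 30-day month.
nov_start = datetime(2011, 11, 1)
nov_end = datetime(2011, 12, 1)
nov_missing = missing_fraction(35 * 3600, nov_start, nov_end)

print(f"October missing:  {oct_missing:.1%}")
print(f"November missing: {nov_missing:.1%}")
```

Under those assumptions this gives roughly 51.7% missing for October and 95.1% for November, in line with the figures above.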
This will be the first time ever we adjust figures manually in wikistats.
It is a line we can only cross once. So at least I'd like to be as careful as possible.
You know when the server died. Only day number or exact time?
Are you positively sure that both servers had equal load?
Yes, they are load balanced with unweighted round-robin and show the same network traffic levels in ganglia. In sampled-1000.log-20111204.gz (0.1% sampling), one was logged 39771 times and the other 39747 times, a difference of less than 0.1%.
The uncertainty here should be much less than that hanging over all of the stats due to the large but variable random logging packet loss for all wmf traffic that occurred during parts of these months.
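To put the two sampled counts in perspective, here is a small sketch comparing their difference with the noise expected from 1:1000 sampling alone. Treating each count as roughly Poisson is an assumption made for illustration, not something established in this thread.

```python
import math

# Hits per server in sampled-1000.log-20111204.gz, as quoted above.
count_a = 39771
count_b = 39747

diff = abs(count_a - count_b)
rel_diff = diff / max(count_a, count_b)

# Assumption: each count is roughly Poisson, so the difference of two
# equally loaded servers has a standard deviation of about sqrt(a + b).
sigma = math.sqrt(count_a + count_b)
z = diff / sigma

print(f"difference:          {diff}")
print(f"relative difference: {rel_diff:.3%}")
print(f"expected noise (1 sigma): {sigma:.0f}")
print(f"difference in sigmas:     {z:.2f}")
```

The counts differ by 24, while sampling noise alone would give a spread of a few hundred, so this sample shows no sign of unequal load.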