On 11/24/06, Antonio Gulli <gulli(a)di.unipi.it> wrote:
> Is wiki using apache web server or something?
> I was referring to the access.log file
Although we use Apache, we do not store an access.log.
We also use squid, but have disabled logging in that as well.
At peak we are serving over 20,000 requests per second. At this
activity level logging would present a non-negligible performance and
Let's pretend for a moment that all accesses hit Apache:
My local MediaWiki installation on Apache produces log entries of
232.13 bytes per hit on average. I would expect that my log entries
would be shorter than the entries we'd see in production.
Over a day we are receiving about 1,188,345,600 HTTP requests.
This would be 256.9 GiB/day in access logs.
At 7.8 terabytes of log data to simply preserve a month's history,
keeping full access logs would be both unreasonable and wasteful.
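
For anyone who wants to check the arithmetic above, here is a rough
sketch. The per-hit size and the daily request count are the figures
quoted in this mail; the binary units (GiB, TiB) and the 31-day month
are my assumptions about how the 7.8 figure comes out, and the
function and constant names are made up for illustration.

```python
# Back-of-the-envelope estimate of access log volume.
# Inputs are the figures quoted in the mail; the names are hypothetical.

BYTES_PER_HIT = 232.13            # average log entry size on a local install
REQUESTS_PER_DAY = 1_188_345_600  # approximate HTTP requests per day

GIB = 2 ** 30  # assumption: binary units
TIB = 2 ** 40
DAYS_PER_MONTH = 31  # assumption: a 31-day month


def log_bytes(requests: int, bytes_per_hit: float) -> float:
    """Return the log volume in bytes if every request wrote one entry."""
    return requests * bytes_per_hit


daily = log_bytes(REQUESTS_PER_DAY, BYTES_PER_HIT)
monthly = daily * DAYS_PER_MONTH

print(f"average rate: {REQUESTS_PER_DAY / 86_400:,.0f} requests/s")  # 13,754
print(f"daily logs:   {daily / GIB:.1f} GiB")                        # 256.9
print(f"monthly logs: {monthly / TIB:.1f} TiB")                      # 7.8
```

With a 30-day month the same calculation gives about 7.5 TiB, so the
conclusion does not depend on that assumption.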
If you have some especially interesting research ideas, and your
research can be done on smaller amounts of data that we might be
collecting (such as the wikicharts data) then I would be glad to
discuss the possibilities. But it would be best to take that
discussion off list...