[Foundation-l] Querylogs and accesslogs

Gregory Maxwell gmaxwell at gmail.com
Fri Nov 24 07:36:31 UTC 2006


On 11/24/06, Antonio Gulli <gulli at di.unipi.it> wrote:
> Is wiki using apache web server or something equivalent server?
> I was referring to the access.log file

Although we use Apache, we do not store an access.log.
We also use Squid, but logging is disabled there as well.

At peak we are serving over 20,000 requests per second. At this
activity level, logging would impose a non-negligible performance and
administrative overhead.

Let's pretend for a moment that all accesses hit Apache:

My local MediaWiki installation on Apache produces log entries of
232.13 bytes per hit on average. I would expect my log entries
to be shorter than the entries we'd see in production.

Over a day we are receiving about 1,188,345,600 HTTP requests.

At 232.13 bytes per hit, this would be 256.9 GiB/day in access logs.

At roughly 7.8 TiB of log data simply to preserve a month's history,
keeping full access logs would be both unreasonable and wasteful.
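
For anyone who wants to check the arithmetic, here is a minimal sketch
of the estimate in Python. The per-hit size and daily request count are
the figures quoted above; the binary units (GiB, TiB) for both results
and the 31-day month are my assumptions about how the totals were
rounded, and the function name is only illustrative.

```python
# Back-of-the-envelope access-log volume, using the figures from this post.
BYTES_PER_HIT = 232.13            # average log entry size on a local install
REQUESTS_PER_DAY = 1_188_345_600  # HTTP requests received per day
SECONDS_PER_DAY = 86_400
DAYS_PER_MONTH = 31               # assumption: a 31-day month

GIB = 2 ** 30                     # assumption: binary units throughout
TIB = 2 ** 40


def log_volume_bytes(requests, bytes_per_hit=BYTES_PER_HIT):
    """Estimated size in bytes of an access log holding `requests` entries."""
    return requests * bytes_per_hit


daily_gib = log_volume_bytes(REQUESTS_PER_DAY) / GIB
monthly_tib = log_volume_bytes(REQUESTS_PER_DAY * DAYS_PER_MONTH) / TIB
average_rps = REQUESTS_PER_DAY / SECONDS_PER_DAY

print(f"{daily_gib:.1f} GiB/day")
print(f"{monthly_tib:.1f} TiB/month")
print(f"{average_rps:.0f} requests/s averaged over the day")
```

Under those assumptions the daily and monthly totals come out at the
figures given in this message. The average rate works out to 13,754
requests per second, below the peak figure, as one would expect.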

If you have some especially interesting research ideas, and your
research can be done on the smaller amounts of data that we might be
collecting (such as the wikicharts data), then I would be glad to
discuss the possibilities. But it would be best to take that
discussion off list...


