Our community wishes to understand the access patterns of our wiki. There seem to be quite a few special pages and extensions that report how many edits occur (both globally and per page). However, after looking around a bit, I could find no software that analyzes how many pages are read on a daily basis by readers (not editors).
Software such as Webalizer counts web page accesses using the Squid/Apache log files, but those logs include entries that cannot be considered actual content accesses. For example, our Apache logs are densely littered with lines such as:
... "GET /skins/common/commonPrint.css?270 HTTP/1.0" ... ... "GET /skins/common/shared.css?270 HTTP/1.0" ... ... "GET /skins/Pinkwich5/main.css?270 HTTP/1.0" ... ... "GET /skins/common/mwsuggest.js?270 HTTP/1.0" ... ... "GET /skins/common/ajax.js?270 HTTP/1.0" ... ... ... ... "GET /wiki?title=MediaWiki:Common.css ... ... "GET /wiki?title=MediaWiki:Pinkwich5.css ... ... "GET /wiki?title=MediaWiki:Print.css
These GET requests do not represent content page accesses.
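In case it helps clarify what I mean: a rough filter over the access log might look like the sketch below. The /skins/ and title=MediaWiki: patterns come from the log excerpts above; the action=edit and title=Special: patterns are my assumptions about which other requests should not count as content views on a typical MediaWiki URL layout.

```python
import re
import sys
from collections import Counter

# Requests that are not content page views (assumed typical MediaWiki
# URL layout; adjust the patterns for your wiki's configuration).
SKIP = re.compile(
    r'/skins/'                            # CSS/JS served from skin directories
    r'|title=MediaWiki:'                  # site CSS/JS pages fetched by the skin
    r'|action=(edit|submit|history|raw)'  # edits and other non-view actions
    r'|title=Special:'                    # special pages
)

def count_views(log_lines):
    """Count GET requests that look like content page views."""
    views = Counter()
    for line in log_lines:
        m = re.search(r'"GET (\S+) HTTP', line)
        if not m:
            continue  # not a GET request line
        url = m.group(1)
        if SKIP.search(url):
            continue  # skin resource, edit, or special page
        views[url] += 1
    return views

if __name__ == '__main__':
    # e.g. python count_views.py < /var/log/apache2/access.log
    for url, n in count_views(sys.stdin).most_common(20):
        print(n, url)
```

This still cannot distinguish logged-in users from anonymous readers, which is part of why I am hoping a proper tool already exists.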
I am looking for software that counts page accesses and can exclude page edits, and perhaps even limit counts to non-logged-in users. I also need software that will automatically generate reports embedded in the wiki (using a Special page or some other approach).
mediawiki-l@lists.wikimedia.org