On Tue, Aug 15, 2006 at 02:04:37PM +0200, Jens Frank wrote:
On Mon, Aug 14, 2006 at 01:56:13PM -0700, Aerik Sylvan wrote:
Do I correctly remember that MediaWiki projects do not keep log files and statistics (other than what can be gleaned from the database itself) in order to reduce server load? I think I remember somebody saying that even log files are not kept... or that could have been some other reality.
Well, it's the reality I'm in. No log files are being written on the squids or the apaches.
...
So - hitting an external log server with the standard client-issued image / JavaScript counter would capture some of that. More data could be gleaned by combining that with the info available from the database.
Has this already been discussed/beaten to death? Is it a dumb idea?
Some Wikipedias have systems like this set up. According to the statistics of the German Wikipedia, a user's discussion page is the second most visited page, with 10% fewer hits than the main page. So, apparently this method does not produce very reliable data.
Then there must be some flaw in the implementation. Otherwise, the method (running some client-side JavaScript that contacts a statistics server) is industry standard (e.g. Google Analytics works this way).
Moreover, it could easily be done with modest hardware and existing statistics software. It's unnecessary to process the whole huge dataset; it would be enough to process a random sample, which can easily be done in JS ... the JS code would contact the statistics server only if some $random > 0.999, for example.
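
A minimal sketch of that sampling idea, assuming a hypothetical stats.example.org endpoint and a 0.1% sampling rate (both would need to be chosen for real use); the hit is reported by requesting a 1x1 image, so the receiving host only needs an ordinary access log:

    // Report roughly 1 in 1000 page views to an external statistics host.
    if (Math.random() > 0.999) {
        var img = new Image(1, 1);
        img.src = 'http://stats.example.org/counter.gif'
                + '?page=' + encodeURIComponent(window.location.pathname)
                + '&r=' + new Date().getTime();  // cache buster
    }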
Jan Kulveit