2009/6/4 Michael Rosenthal rosenthal3000@googlemail.com:
I suggest keeping the bug on Wikimedia's servers and using a tool that relies on SQL databases. These could be shared with the toolserver, where the "official" version of the analysis tool runs and users can run their own queries (so picking a tool with a good database structure would be nice). With that, the toolserver users could set up their own cool tools on that data.
As I understand it, the earlier problem with stats was that the stats server would melt under the load. Leon's old wikistats page sampled 1:1000. The current stats (on dammit.lt, and served up nicely at http://stats.grok.se) count every hit, but I understand (Domas?) that it took quite a bit of work to get the firehose of data into a form that wouldn't melt the receiving server trying to process it.
OK, then the problem becomes: how can we feasibly set up something like stats.grok.se internally for all the other data gathered from a hit? (Modulo stuff that needs to be blanked per privacy policy.)
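
To make the 1:1000 sampling trade-off concrete, here is a rough Python sketch of the general idea: keep about one hit in a thousand and scale the counts back up. This is not Leon's or Domas's actual code, which I haven't seen. The names sample_hits and estimate_counts are made up for illustration, and a "hit" is assumed to be just a page title.

```python
import random
from collections import Counter

SAMPLE_RATE = 1000  # assumed: keep roughly 1 hit in 1000


def sample_hits(hits, rate=SAMPLE_RATE, rng=None):
    """Yield roughly one in `rate` hits, each chosen independently at random."""
    if rng is None:
        rng = random.Random()
    for page in hits:
        # randrange(rate) returns 0 with probability 1/rate
        if rng.randrange(rate) == 0:
            yield page


def estimate_counts(sampled, rate=SAMPLE_RATE):
    """Scale sampled per-page counts back up to estimated totals."""
    counts = Counter(sampled)
    return {page: n * rate for page, n in counts.items()}
```

The receiving side then only has to process about a thousandth of the traffic, at the cost of noisy numbers for rarely viewed pages. Counting every hit avoids the noise but means the full firehose has to be aggregated somewhere.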
- d.