Why hasn't Wikistats been updated since May?
Because Erik Zachte, the third party who makes them, hasn't updated his
scripts since May?
Excuse me?
In June I asked twice for new dumps, so that I could run the stats once more before the conversion to MediaWiki 1.5, knowing that I would not have time for yet another upgrade of the scripts soon.
As soon as the dumps were available I tried to run the stats. It turned out that the database had already been changed even before MediaWiki 1.5. Lots of archive data had been moved to another database for which no dump was available at all!! And priorities were such that one was not to be expected soon. If someone bothered at all to announce that publicly, e.g. on wikitech, I missed it, but it would not have changed much, except saving me the time of finding this out myself.
Then the MediaWiki 1.5 upgrade occurred, which brought a completely new database schema. After a while Brion announced he would produce new dumps in XML format, which would simplify things a lot. (It does.)
So I waited for that. The new dumps came about mid-July, when I had started to prepare for two Wikimania presentations. A week after Wikimania I started to work on updating the scripts.
The new XML dump format makes things much simpler (and hopefully more consistent), yet because I want to keep the scripts backwards compatible for other MediaWiki installations that use them, at least for a while, it was not so simple to add code for the new dumps after all.
Let me tell you that the scripts are not exactly trivial. They contain many thousands of lines of Perl code. Partly because the old SQL dumps became nearly impossible to parse offline, but I got it working, within space constraints. Partly because they do much more than simple counting, e.g. timelines overview, category trees, Wikibooks indexes; they have to process data in a certain sequence for that, and also bother to present data somewhat differently for each project.
So considering I could only start on this >> two weeks ago << I'm making fair progress.
By the way, many people would like to see more frequent dumps. I also still do not know why dumps are not produced more often, but trust there will be a good reason, probably performance. There is no fixed schedule at all: one waits, people start asking for a new dump, often get no answer, and finally it arrives, more or less once a month these days.
Erik Zachte
wikimedia-l@lists.wikimedia.org