On Sat, May 17, 2008 at 1:29 PM, Domas Mituzas midom.lists@gmail.com wrote:
Helloes,
There are a few new things at http://dammit.lt/wikistats/:
- All projects are included. Non-Wikipedia projects will have a suffix
in the raw data. Suffixes are pretty much self-explanatory (haha).
  wiktionary:  .d
  wikinews:    .n
  wikimedia:   .m (meta, commons et al)
  wikibooks:   .b
  wikisource:  .s
  mediawiki:   .w
  wikiversity: .v
  wikiquote:   .q
- For lazy people there will be daily packages, which will:
  - Have a single .tgz archive with per-project files inside (no more
    splitting!)
  - Um, be aggregated daily instead of hourly
  - Leave out pages with a low number of reads (a page needs at least
    10 daily visits to be included)
  - Be generally much, much smaller (enwiki daily compressed filtered
    data is just 5 MB)
For now the build process will go back just a week, but over time the archive may grow. This will also reduce the hourly data retention (unless archive.org or someone else wishes to archive everything).
I'll also be in the process of upgrading my box (or maybe moving to the new shiny stats server we may get some day :), because it takes an hour to actually process the data on my 3-year-old flake :)
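For anyone who wants to reproduce the daily packages from the hourly files, here is a rough sketch of the aggregation and the 10-visit cutoff described above. The line layout (`project title count bytes`, separated by whitespace) and the names `aggregate_daily` and `MIN_DAILY_VISITS` are assumptions made for illustration; the actual build script is not shown in this mail and may well work differently.

```python
from collections import defaultdict

# Threshold taken from the announcement: at least 10 daily visits.
MIN_DAILY_VISITS = 10


def aggregate_daily(hourly_lines, min_visits=MIN_DAILY_VISITS):
    """Sum hourly counters per (project, title) and drop rarely read pages.

    Assumes each line looks like "project title count bytes"; lines that
    do not fit that shape are skipped.
    """
    totals = defaultdict(lambda: [0, 0])
    for line in hourly_lines:
        parts = line.split()
        if len(parts) != 4:
            continue  # not the assumed four-field layout
        project, title, count, size = parts
        try:
            count, size = int(count), int(size)
        except ValueError:
            continue  # counters are not numeric, skip the line
        totals[(project, title)][0] += count
        totals[(project, title)][1] += size
    # Keep only pages that reach the daily visit threshold.
    return {
        key: (count, size)
        for key, (count, size) in totals.items()
        if count >= min_visits
    }
```

Feeding it the lines of all 24 hourly files for one day gives a dictionary keyed by `(project, title)` with the summed count and bytes, which could then be written out per project.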
- Second number is now actually bytes, in case anyone is interested :)
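As a rough illustration of the suffixes and the bytes field mentioned above, a raw line could be decoded along these lines. The suffix table is copied from this mail; the line layout (`project title count bytes`), the `en.d`-style project code, and the helper names `split_project` and `parse_line` are assumptions, so treat this as a sketch rather than a description of the real files.

```python
# Suffix table copied from the announcement; a project code without one
# of these suffixes is treated as Wikipedia.
SUFFIXES = {
    "d": "wiktionary",
    "n": "wikinews",
    "m": "wikimedia",
    "b": "wikibooks",
    "s": "wikisource",
    "w": "mediawiki",
    "v": "wikiversity",
    "q": "wikiquote",
}


def split_project(code):
    """Split a project code such as 'en.d' into (prefix, project family)."""
    prefix, sep, suffix = code.rpartition(".")
    if sep and suffix in SUFFIXES:
        return prefix, SUFFIXES[suffix]
    return code, "wikipedia"


def parse_line(line):
    """Parse one line, assuming the layout 'project title count bytes'."""
    project, title, count, size = line.split()
    prefix, family = split_project(project)
    return {
        "prefix": prefix,
        "family": family,
        "title": title,
        "count": int(count),
        "bytes": int(size),  # the second number, now actual bytes
    }
```

For example, `parse_line("en.d foo 3 1234")` yields the family `wiktionary` with a count of 3 and 1234 bytes, while a plain `en` code falls through to `wikipedia`.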
I've been getting various feedback lately from the non-wiki world, where people use this data for popularity ranking of various bits.
BR,
Domas Mituzas -- http://dammit.lt/ -- [[user:midom]]
Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l
Nice to see them all! What would also be nice is search statistics. Currently only Special:Search/* can be found, whereas the majority of searches go through index.php?title=Special:Search=&search=xyz or Special:Search?search=xyz.
Bryan
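To make the three request forms above concrete, here is a small sketch of how a search term could be pulled out of each of them. The function name `search_term` and the `/wiki/` and `/w/` path prefixes in the usage examples are assumptions; whether the stats pipeline ever sees query strings at all is not stated in this thread.

```python
from urllib.parse import parse_qs, unquote, urlsplit


def search_term(url):
    """Return the search term of a Special:Search request, or None."""
    parts = urlsplit(url)
    query = parse_qs(parts.query)
    path = unquote(parts.path)
    # The page can be named in the path or in the title= parameter.
    on_search_page = (
        "Special:Search" in path
        or query.get("title", [""])[0].startswith("Special:Search")
    )
    if not on_search_page:
        return None
    # Query-string forms: ...?title=Special:Search&search=xyz
    # and Special:Search?search=xyz
    if "search" in query:
        return query["search"][0]
    # Path form: Special:Search/xyz
    _, sep, tail = path.partition("Special:Search/")
    return tail if sep and tail else None
```

With this, `/wiki/Special:Search/xyz`, `/wiki/Special:Search?search=xyz` and `/w/index.php?title=Special:Search&search=xyz` all give `xyz`, and any other page gives `None`.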