One thing I think I will add is a text byte size field on the revision table; with individual-revision compression we can no longer easily get the size without decompressing the text. Generally this size will not change, either, since a given revision's source text is immutable.
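A minimal sketch of the idea, assuming the size is computed once when the revision is saved and stored next to the compressed blob. The names `text_size` and `text_blob` are hypothetical, and `zlib` stands in for whatever compression the new scheme actually uses.

```python
import zlib


def make_revision_row(text: str) -> dict:
    """Build a row holding the compressed text plus its uncompressed byte size.

    Field names are hypothetical; zlib is an assumed stand-in compressor.
    """
    raw = text.encode("utf-8")
    return {
        "text_size": len(raw),            # bytes of the uncompressed source text
        "text_blob": zlib.compress(raw),  # per-revision compressed text
    }


def revision_size(row: dict) -> int:
    """Return the text size without touching the compressed blob."""
    return row["text_size"]
```

Because a revision's text never changes, the stored size only has to be written once and never needs updating.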
I'll need to parse the full article text anyway, for several stats: number of internal/external links, image links, word count, ...
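A rough sketch of what such a parse could look like, not the actual stats script. The regular expressions are simplified assumptions about wikitext (no nested links, no templates, `Image:` as the only image prefix), and the word count is deliberately crude: it counts whitespace-separated tokens of the raw text, markup included.

```python
import re

# Simplified patterns; real wikitext has many more cases than these cover.
INTERNAL_LINK = re.compile(r"\[\[([^\[\]|]+)(?:\|[^\[\]]*)?\]\]")
EXTERNAL_LINK = re.compile(r"(?<!\[)\[(?:https?|ftp)://[^\]]*\]")


def article_stats(text: str) -> dict:
    """Count links, image links and words in one article's raw text."""
    targets = [m.group(1).strip() for m in INTERNAL_LINK.finditer(text)]
    images = sum(1 for t in targets if t.lower().startswith("image:"))
    return {
        "internal_links": len(targets) - images,
        "image_links": images,
        "external_links": len(EXTERNAL_LINK.findall(text)),
        "words": len(text.split()),  # crude: raw tokens, markup included
    }
```

All of these counts need the uncompressed text, which is why a stored size field alone does not spare the stats job from reading every article.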
If I run directly on the database, the job will run for days. That is fine with me, but heavy queries used to be a problem every now and then. Or is that no longer the case?
I'd better make the counts job incremental then. It is a bit less flexible and more error-prone when the script is updated, but it can be done. Any idea when the new scheme will be implemented?
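One way an incremental job might be structured, as a sketch only: keep a checkpoint of the last revision processed plus running totals, and throw the totals away when the script version changes, since old totals may no longer match the new counting rules. `SCRIPT_VERSION`, the state layout, and the `count_fn` callback are all hypothetical.

```python
SCRIPT_VERSION = 1  # bump whenever the counting rules change


def new_state() -> dict:
    """Fresh checkpoint: nothing processed yet."""
    return {"version": SCRIPT_VERSION, "last_rev_id": 0, "totals": {}}


def update_counts(state: dict, revisions, count_fn) -> dict:
    """Fold revisions newer than the checkpoint into the running totals.

    revisions: iterable of (rev_id, text) pairs.
    count_fn:  maps a text to a dict of counter name -> integer.
    """
    if state.get("version") != SCRIPT_VERSION:
        state = new_state()  # totals from an older script are discarded
    for rev_id, text in sorted(revisions):
        if rev_id <= state["last_rev_id"]:
            continue  # already counted in an earlier run
        for key, value in count_fn(text).items():
            state["totals"][key] = state["totals"].get(key, 0) + value
        state["last_rev_id"] = rev_id
    return state
```

The version check is the price of going incremental: after a script update, a full recount is needed once before the cheap incremental runs can resume.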
Erik Zachte