One thing I think I will add is a text byte-size field on the revision
table; with individual-revision compression we can no longer easily get
the size short of decompressing the text to see what it looks like.
Generally this size will not change, either, since a given revision's
source text is immutable.
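
A rough sketch of the idea, not the actual schema: compute the byte
length once, at save time, and store it next to the compressed text.
The names `text_size` and `text_blob` are made up for illustration, and
zlib stands in for whatever compression the new scheme ends up using.

```python
import zlib


def make_revision_row(text):
    """Build a (hypothetical) revision row holding the size and the blob."""
    raw = text.encode("utf-8")
    return {
        # Size of the uncompressed text in bytes, not in characters.
        "text_size": len(raw),
        # Compressed per revision; its length says nothing about text_size.
        "text_blob": zlib.compress(raw),
    }


def revision_size(row):
    """Read the size back without touching the compressed blob."""
    return row["text_size"]
```

Because a revision's text never changes, the stored size never needs to
be recomputed after the row is written.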
I'll need to parse the full article text anyway, for several stats:
number of internal/external links, image links, word count, ...
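
To give an idea of the kind of parsing meant here, a crude sketch that
counts these stats with regular expressions. It is not the real stats
script: the patterns, the function name `article_stats` and the
English-only `Image:`/`File:` prefixes are assumptions, and nested links
inside image captions are not handled.

```python
import re

# [[Target]] or [[Target|label]]; group 1 is the target, group 2 the rest.
INTERNAL_LINK = re.compile(r"\[\[([^\[\]|]+)(?:\|([^\[\]]*))?\]\]")
# Any http/https/ftp URL, bracketed or bare.
EXTERNAL_LINK = re.compile(r'(?:https?|ftp)://[^\s\]<>"]+')
WORD = re.compile(r"\w+")
IMAGE_PREFIXES = ("image:", "file:")  # assumption: English namespace names


def is_image(target):
    return target.strip().lower().startswith(IMAGE_PREFIXES)


def article_stats(wikitext):
    targets = [m.group(1) for m in INTERNAL_LINK.finditer(wikitext)]
    image_links = sum(1 for t in targets if is_image(t))
    internal_links = len(targets) - image_links
    external_links = len(EXTERNAL_LINK.findall(wikitext))

    def visible(match):
        # Keep the label (or the target) of a link, drop image links.
        if is_image(match.group(1)):
            return " "
        return match.group(2) or match.group(1)

    # Very rough plain text: links reduced to their labels, URLs removed.
    plain = INTERNAL_LINK.sub(visible, wikitext)
    plain = EXTERNAL_LINK.sub(" ", plain)
    return {
        "internal_links": internal_links,
        "image_links": image_links,
        "external_links": external_links,
        "words": len(WORD.findall(plain)),
    }
```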
If I run directly on the database the job will run for days.
Fine with me, but heavy queries are a problem every now and then, or not?
I'd better make the counts job incremental then.
It is a bit less flexible and more error-prone on script updates,
but it can be done. Any idea when the new scheme will be implemented?
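
For what it's worth, a minimal sketch of what an incremental counts job
could look like, assuming revisions arrive as (rev_id, text) pairs and
that revision ids only grow. All names here (`run_incremental`,
`SCRIPT_VERSION`, the layout of `state`) are hypothetical. The version
check is one way to deal with script updates: if the counting logic
changes, the saved totals are thrown away and everything is recounted.

```python
SCRIPT_VERSION = 1  # bump whenever the counting logic changes


def run_incremental(revisions, state, compute):
    """Add stats for revisions newer than the last one already processed.

    revisions: iterable of (rev_id, text) pairs
    state:     dict kept between runs (e.g. loaded from and saved to a file)
    compute:   function mapping text to a dict of numeric stats
    """
    if state.get("version") != SCRIPT_VERSION:
        # Old totals came from different logic: start over.
        state.clear()
        state.update(version=SCRIPT_VERSION, last_rev_id=0, totals={})

    last = state["last_rev_id"]
    totals = state["totals"]
    for rev_id, text in sorted(revisions):
        if rev_id <= last:
            continue  # already counted in an earlier run
        for key, value in compute(text).items():
            totals[key] = totals.get(key, 0) + value
        last = rev_id
    state["last_rev_id"] = last
    return state
```

Each run then only touches the revisions added since the previous run,
which keeps the load on the database small.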