Hi Ariel,
On Wed, Feb 08, 2012 at 12:50:25PM +0200, Ariel T. Glenn wrote:
> I don't know if there are any texts stored in the text table directly that contain multiple compressed revisions, on the production cluster.
Ok. Thanks.
> There are certainly some revision texts not stored in external store, which consist of serialized objects (perhaps broken) or gzipped data.
Local objects (with a single text), local gzip, and combinations thereof are fine. I had already tested them.
By reading more code I now see that I misunderstood the 'object' description in maintenance/tables.sql.
Just for the archives: The description
text field contained a serialized PHP object. object either contains multiple versions compressed to achieve a better compression ratio, or it refers to another row where the text can be found.
does not mean that the 'object' flag (for locally stored rows) allows a row to refer to a different row.
The situation is rather the following:
+---------+----------+--------------+
| old_id  | old_text | old_flags    |
+---------+----------+--------------+
| 1       | SER_OBJA | object,utf-8 |
| 2       | SER_OBJB | object,utf-8 |
| 3       | SER_OBJC | object,utf-8 |
+---------+----------+--------------+
SER_OBJA is a serialized object, whose unserialized representation's method getText() yields the old_text for old_id 1.
SER_OBJB is a serialized object, whose unserialized representation's method getText() yields the old_text for old_id 2.
SER_OBJC is a serialized object, whose unserialized representation's method getText() yields the old_text for old_id 3.
Nothing fancy. End of story.
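For the archives, here is a minimal sketch of that simple case. It is a Python analogy, not MediaWiki's PHP code: PlainText, CLASSES, serialize(), unserialize() and revision_text() are names made up for illustration, and JSON stands in for PHP's serialize()/unserialize().

```python
import json


class PlainText:
    """Hypothetical object that simply carries its own text."""

    def __init__(self, text):
        self.text = text

    def get_text(self):
        return self.text


# Registry of classes that may appear in serialized form (an assumption of
# this sketch; PHP's unserialize() resolves class names on its own).
CLASSES = {"PlainText": PlainText}


def serialize(obj):
    """Store the class name and the instance fields as JSON."""
    return json.dumps({"class": type(obj).__name__, "fields": vars(obj)})


def unserialize(data):
    """Rebuild an object from the JSON produced by serialize()."""
    record = json.loads(data)
    return CLASSES[record["class"]](**record["fields"])


def revision_text(old_text, old_flags):
    """If the 'object' flag is set: unserialize, then call get_text()."""
    if "object" in old_flags.split(","):
        return unserialize(old_text).get_text()
    return old_text


stored = serialize(PlainText("text of revision 1"))
print(revision_text(stored, "object,utf-8"))
```

The caller never needs to know what kind of object sits in old_text; it only relies on get_text() being there.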
However, it is up to each object how it implements getText().
So for example, SER_OBJA may have further methods (e.g.: getItem( int hash ) ), and can make further text available through this additional method.
Then, SER_OBJB can act as a proxy: its own parameterless getText() method forwards the call to (unserialize(SER_OBJA))->getItem( SOME_CONSTANT_HASH ). The magic (fetching SER_OBJA, unserializing, ...) is hidden within SER_OBJB.
Furthermore, SER_OBJC may again act as a proxy, just as SER_OBJB did, but SER_OBJC will likely use a different SOME_CONSTANT_HASH.
Typically SER_OBJA would be a ConcatenatedGzipHistoryBlob, and SER_OBJB and SER_OBJC would be HistoryBlobStubs.
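The proxy arrangement can be sketched roughly as follows. Again this is a Python analogy under stated assumptions, not the actual MediaWiki classes: ConcatenatedBlob and BlobStub are hypothetical stand-ins for ConcatenatedGzipHistoryBlob and HistoryBlobStub, TEXT_TABLE stands in for the text table, JSON replaces PHP serialization, the MD5-based item hash is a choice made for this sketch, and compression is left out entirely.

```python
import hashlib
import json

# Stand-in for the text table: old_id -> (old_text, old_flags).
TEXT_TABLE = {}


class ConcatenatedBlob:
    """Hypothetical container holding several texts, keyed by hash."""

    def __init__(self, items=None, default_hash=None):
        self.items = items or {}
        self.default_hash = default_hash

    def add_item(self, text):
        item_hash = hashlib.md5(text.encode("utf-8")).hexdigest()
        self.items[item_hash] = text
        if self.default_hash is None:
            self.default_hash = item_hash  # first item is the row's own text
        return item_hash

    def get_item(self, item_hash):
        return self.items[item_hash]

    def get_text(self):
        return self.get_item(self.default_hash)


class BlobStub:
    """Hypothetical proxy: get_text() forwards to another row's container."""

    def __init__(self, ref_id, item_hash):
        self.ref_id = ref_id
        self.item_hash = item_hash

    def get_text(self):
        old_text, _flags = TEXT_TABLE[self.ref_id]  # fetch the other row
        return unserialize(old_text).get_item(self.item_hash)


CLASSES = {"ConcatenatedBlob": ConcatenatedBlob, "BlobStub": BlobStub}


def serialize(obj):
    return json.dumps({"class": type(obj).__name__, "fields": vars(obj)})


def unserialize(data):
    record = json.loads(data)
    return CLASSES[record["class"]](**record["fields"])


def revision_text(old_id):
    """Callers only ever unserialize and call get_text()."""
    old_text, old_flags = TEXT_TABLE[old_id]
    if "object" in old_flags.split(","):
        return unserialize(old_text).get_text()
    return old_text


blob = ConcatenatedBlob()
blob.add_item("text of revision 1")
hash_b = blob.add_item("text of revision 2")
hash_c = blob.add_item("text of revision 3")

TEXT_TABLE[1] = (serialize(blob), "object,utf-8")                 # SER_OBJA
TEXT_TABLE[2] = (serialize(BlobStub(1, hash_b)), "object,utf-8")  # SER_OBJB
TEXT_TABLE[3] = (serialize(BlobStub(1, hash_c)), "object,utf-8")  # SER_OBJC

for old_id in (1, 2, 3):
    print(old_id, revision_text(old_id))
```

Note that revision_text() looks the same for all three rows; the reference to row 1 lives entirely inside the stub's get_text().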
Hence, the description in tables.sql is somewhat accurate, but overly detailed, and it thereby points in a misleading direction (for local objects). It seems that all the 'object' flag denotes is: 1. unserialize. 2. call getText().
That's a terrific page. Too bad I did not find it myself :(
Thanks a lot for your help.
Kind regards, Christian