Thanks.
Is there any script (SQL, PHP, Perl, ...) that I could run (or write) to
update my local "old" table, replacing the compressed old_text fields with
their uncompressed equivalents?
Indeed, the research project I am working on requires me to know the size
of each edit or comment in the "old" table.
Kevin Carillo
Kevin Carillo wrote:
I've successfully imported both the cur and old tables.
However, I am realizing that the text in the "old" table seems to be in a
compressed format.
I do not know how MediaWiki is able to uncompress it and display it
properly, but I'd like to access the column "old_text" through SQL
queries. How could I do that?
The old_flags field indicates the format of the old_text field: whether
it's compressed, whether it's a serialized object, etc.
The simplest thing to do is to work within the MediaWiki framework and
use Article::getRevisionText() on the retrieved database row: this will
run the appropriate decompression and will unserialize and extract text
from HistoryBlob and HistoryBlobStub object rows.
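The flag handling described above can be sketched outside MediaWiki as well. The following is a rough Python approximation, not MediaWiki's own code: it assumes the "gzip" flag marks output of PHP's gzdeflate() (a raw DEFLATE stream, hence the negative wbits), assumes latin-1 as the fallback charset when no "utf-8" flag is present, and simply refuses "object" rows, since those hold serialized PHP HistoryBlob/HistoryBlobStub objects that really do need MediaWiki (or a PHP unserializer) to unpack.

```python
import zlib

def decode_old_text(old_text: bytes, old_flags: str) -> str:
    """Decode an old_text blob according to its comma-separated old_flags.

    Handles only the simple cases; rows flagged 'object' contain a
    serialized HistoryBlob/HistoryBlobStub and are left to
    Article::getRevisionText() inside MediaWiki.
    """
    flags = [f.strip() for f in old_flags.split(",") if f.strip()]
    if "object" in flags:
        raise NotImplementedError("serialized HistoryBlob rows need MediaWiki")
    text = old_text
    if "gzip" in flags:
        # Raw DEFLATE stream (no gzip/zlib header), as produced by
        # PHP's gzdeflate() -- an assumption about the on-disk format.
        text = zlib.decompress(text, -zlib.MAX_WBITS)
    if "utf-8" in flags:
        return text.decode("utf-8")
    return text.decode("latin-1")  # assumed pre-1.5 default charset
```

With something like this, the per-edit size Kevin is after is just `len(decode_old_text(row.old_text, row.old_flags))` for the rows it can handle.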
You might also check Erik Zachte's statistics scripts, a Perl-based
package, to see how he's reading the data.
After we've gotten MediaWiki 1.5 in place over the next couple of weeks
we'll be providing a simple XML wrapper format for the backup dumps
which won't be nearly as messy to deal with. (This is the same format
used by Special:Export.) This will include an importer tool which could
be used to import all the data to a local database without compression.
-- brion vibber (brion @ pobox.com)