Brion has recently added code to store articles in the 'old' SQL table in compressed format, so I will need to adjust the scripts for the international stats.
I spent several hours on it, and despite some useful tips from Brion I can't get the article data inflated; all I get is a Z_DATA_ERROR (-3).
Brion sent me a small sample of the articles in the fr: 'old' dump in compressed raw format, without escape sequences and other fields, just article data. Even this I could not tackle.
Brion wrote:
Here's a zip file containing the raw bytes of compressed old_text from
the first up to 100 columns in the table:
http://leuksman.com/misc/raw.zip
They do decompress with gzdeflate() in PHP.
Here is my test script:

#!/usr/bin/perl
use CGI::Carp qw(fatalsToBrowser);
use Compress::Zlib;

$path = "raw/" ;
($refinf, $status) = inflateInit();

for ($i = 1 ; $i <= 100 ; $i++)
{ &ReadFile ($i) ; }
exit ;

sub ReadFile
{
  $file_in = $path . "old-" . $i . ".raw" ;
  open "FILE_IN", "<", $file_in || die ("Input file " . $file_in . " could not be opened.") ;
  binmode FILE_IN ;
  $article = "" ;
  while ($line = <FILE_IN>)
  { chomp ($line) ; $article .= $line ; }

  ($article2, $status) = $refinf->inflate ($article) ;
  if ($status == Z_OK) # Z_OK = 0
  { print "$i:OK: " . substr ($article2,0,50) . "\n" ; }
  else
  { print "$i:Unzip error: $status\n" ; } # Z_DATA_ERROR = -3
}
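One possible explanation, offered as an assumption rather than a confirmed diagnosis: PHP's gzdeflate() writes a raw deflate stream with no zlib header, while inflateInit() without arguments expects that header. Compress::Zlib accepts a negative WindowBits value to request raw deflate. The sketch below also reads each file in one piece without chomp (a 0x0A byte inside compressed data is payload, not a line ending), creates a fresh inflation stream per article, and treats Z_STREAM_END as success. The helper names slurp_binary, inflate_raw and deflate_raw are hypothetical, and the round trip uses a made-up sample string instead of the raw/old-N.raw files.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Compress::Zlib;

# Hypothetical helper: read a whole file as raw bytes, without chomp.
sub slurp_binary {
    my ($file) = @_;
    open(my $fh, '<', $file) or die "Input file $file could not be opened: $!";
    binmode $fh;
    local $/;                 # slurp mode: read everything in one go
    my $bytes = <$fh>;
    close $fh;
    return $bytes;
}

# Hypothetical helper: inflate a headerless (raw) deflate stream.
# Assumption: the data was produced the way PHP's gzdeflate() produces it.
sub inflate_raw {
    my ($bytes) = @_;
    # Negative WindowBits = raw deflate; a new stream for every article.
    my ($inf, $init_status) = inflateInit(-WindowBits => -(MAX_WBITS));
    return (undef, $init_status) unless defined $inf;
    # inflate() consumes its argument; $bytes is already a private copy.
    my ($out, $status) = $inf->inflate($bytes);
    return ($out, $status);
}

# Hypothetical helper, only here to make the demo self-contained:
# produce raw deflate data comparable to what gzdeflate() emits.
sub deflate_raw {
    my ($text) = @_;
    my ($def) = deflateInit(-WindowBits => -(MAX_WBITS));
    my ($part1) = $def->deflate($text);
    my ($part2) = $def->flush();
    return $part1 . $part2;
}

# Round trip on a made-up sample string.
my $sample = "Bonjour Wikipedia " x 20;
my ($text, $status) = inflate_raw(deflate_raw($sample));

# A complete stream normally ends with Z_STREAM_END (1), not Z_OK (0).
if ($status == Z_STREAM_END || $status == Z_OK) {
    print "OK: " . substr($text, 0, 50) . "\n";
} else {
    print "Unzip error: $status\n";
}
```

With the real sample, the loop body would become something like inflate_raw(slurp_binary("raw/old-$i.raw")). Whether this clears the Z_DATA_ERROR on Brion's files is untested here.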
Can someone help me out with this? I can deflate/inflate dummy texts, so the libraries are all in place. (I use ActivePerl 5.8 on Windows.)

-------------------------------------------------------------

There is a second problem, possibly trivial once the problem above has been solved (well, actually I hope the problem above is a trivial oversight of mine too):
The SQL dump contains escape sequences. A small section of the fr: old dump, new style, that Brion sent me contains:
\Z: 3541 times
\: 3497
": 3428
\n: 3598
\r: 3550
\0: 3190
\Z is not listed on http://www.mysql.com/doc/en/String_syntax.html and I could not find any other doc referring to it. \z is listed, so maybe upper/lower case makes no difference, but I doubt it.
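For what it's worth, the MySQL documentation for mysql_real_escape_string() describes \Z as ASCII 26 (Control-Z), escaped alongside NUL, \n, \r, backslash and the quote characters. Assuming the dump was escaped that way, a minimal sketch of undoing it in one left-to-right pass could look like this. The name mysql_unescape and the table %UNESCAPE are hypothetical, and \% and \_ get no special treatment here.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Escape letters that stand for a different byte (assumed MySQL conventions).
my %UNESCAPE = (
    '0' => "\0",      # NUL
    'n' => "\n",      # newline
    'r' => "\r",      # carriage return
    't' => "\t",      # tab
    'b' => "\x08",    # backspace
    'Z' => "\x1A",    # ASCII 26, Control-Z
);

# Hypothetical helper: replace each backslash sequence by the byte it stands for.
# Any other escaped character (backslash, quotes) stands for itself.
sub mysql_unescape {
    my ($s) = @_;
    # /s lets "." match a newline; one pass, so an escaped backslash
    # followed by "n" is not mistaken for an escaped newline.
    $s =~ s/\\(.)/exists $UNESCAPE{$1} ? $UNESCAPE{$1} : $1/ges;
    return $s;
}

# Demo: a, \Z, b, \n as they would appear in the dump.
my $decoded = mysql_unescape('a\\Zb\\n');
print join(' ', map { ord } split //, $decoded), "\n";   # prints 97 26 98 10
```

The unescaping would have to run on the old_text field before the result is handed to the inflate step, since the escapes were applied to the compressed bytes.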
Anyone encountered this before?
Thanks for any help.
Erik Zachte