Ian Smith wrote:
2) Save the string which fails to a file and provide it.
When I save the "text" field from the "text" table to a file using the
MySQL Query Browser, the offending section looks like this (the dodgy
quotes are either side of "gpedit.msc" on the third line):
[snip]
Can you pull it from the actual string instead of the database? If
you're already there, you can just save the string at that point in the
code.
You should also check the revision's actual contents. The raw database
may be using the wrong underlying encoding. By default, MediaWiki is
optimized for MySQL 4.0 and stores all data as UTF-8 without caring
what MySQL thinks the encoding is. In MySQL 4.1 or later, a raw fetch
from MySQL may therefore return unexpected variations on the encoding,
depending on how you access it.
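To illustrate the kind of variation this can produce, here is a minimal sketch, not MediaWiki code and not necessarily what happened in this case. It assumes the mbstring extension is available and that the client treats each stored UTF-8 byte as an ISO-8859-1 character and converts it to UTF-8 a second time. The variable names are made up for the example.

```php
<?php
// Hypothetical illustration only; requires the mbstring extension.

// U+201C LEFT DOUBLE QUOTATION MARK, written as its raw UTF-8 bytes.
$original = "\xE2\x80\x9C";

// Simulate a client that believes the stored bytes are ISO-8859-1 and
// re-encodes each byte as a separate character in UTF-8.
$misread = mb_convert_encoding($original, 'UTF-8', 'ISO-8859-1');

echo bin2hex($original), "\n"; // e2809c (one character, three bytes)
echo bin2hex($misread), "\n";  // c3a2c280c29c (three characters, six bytes)
```

If the saved file shows six bytes where a single curly quote should be, the text was probably converted on the way out of the database rather than stored wrongly.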
You can load the current revision of a particular page and save it to a
file like so:
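// Look up the page by name, then fetch its current revision.
$title = Title::newFromText("My page name");
$rev = Revision::newFromTitle($title);
// Write the revision text to a file exactly as MediaWiki sees it.
$text = $rev->getText();
file_put_contents("outfile.txt", $text);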
(You can use maintenance/eval.php to run code within the MediaWiki
framework from the command line.)
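Once the text is in a file, it can help to look at the exact bytes around the dodgy quotes. The following is a rough sketch in plain PHP; findNonAsciiRuns is a hypothetical helper written for this example, not part of MediaWiki.

```php
<?php
// Hypothetical helper: list every run of non-ASCII bytes in a string,
// keyed by byte offset, with the bytes shown as hex.
function findNonAsciiRuns($text) {
    $runs = array();
    // No /u modifier, so the pattern matches raw bytes 0x80-0xFF.
    preg_match_all('/[\x80-\xFF]+/', $text, $matches, PREG_OFFSET_CAPTURE);
    foreach ($matches[0] as $match) {
        // $match[0] is the matched bytes, $match[1] is the byte offset.
        $runs[$match[1]] = bin2hex($match[0]);
    }
    return $runs;
}

// Sample input: curly quotes (U+201C, U+201D) as correct UTF-8.
$sample = "run \xE2\x80\x9Cgpedit.msc\xE2\x80\x9D";
print_r(findNonAsciiRuns($sample));
```

For the sample above, correctly encoded curly quotes show up as e2809c and e2809d. To check the real text, pass file_get_contents("outfile.txt") instead of $sample and compare what it reports.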
-- brion vibber (brion @ wikimedia.org)