Ian Smith wrote:
- Save the string which fails to a file and provide it.
When I save the "text" field from the "text" table to a file using MySQL Query Browser, the offending section looks like this (the dodgy quotes are either side of "gpedit.msc" on the third line):
[snip]
Can you pull it from the actual string instead of the database? If you're already there, you can just save the string at that point in the code.
You should also check the revision's actual contents. The raw database may be using the wrong underlying encoding. MediaWiki by default is optimized for MySQL 4.0, and stores all data as UTF-8 without caring what MySQL thinks the encoding is. In MySQL 4.1 or later, a raw fetch from MySQL may therefore return unexpected variations on the encoding, depending on how you access it.
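As a rough illustration of that kind of mangling, here is a minimal sketch that runs standalone (no MediaWiki needed). It assumes the dodgy quote is a curly quote (U+201C) and that the raw fetch treats the stored UTF-8 bytes as latin1/Windows-1252 and converts them to UTF-8 again. Both are assumptions for the example, not something confirmed for your setup, and it needs PHP's mbstring extension.

```php
<?php
// Assumption: the page text contains a left curly quote (U+201C),
// stored as the UTF-8 bytes E2 80 9C, followed by plain ASCII.
$original = "\xE2\x80\x9Cgpedit.msc";

// Assumption: the raw fetch believes the column is latin1 (Windows-1252)
// and converts it to UTF-8, so each stored byte becomes its own character.
$mangled = mb_convert_encoding($original, 'UTF-8', 'Windows-1252');

// Compare the byte sequences: the mangled copy is longer.
echo bin2hex($original), "\n";
echo bin2hex($mangled), "\n";

// Reversing the conversion recovers the original bytes.
$recovered = mb_convert_encoding($mangled, 'Windows-1252', 'UTF-8');
var_dump($recovered === $original);
```

If the file saved from the database shows several odd characters where a single quote mark should be, while the text pulled through MediaWiki itself looks fine, this sort of double conversion is a likely suspect.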
You can load the current revision of a particular page and save it to a file like so:
  $title = Title::newFromText("My page name");
  $rev = Revision::newFromTitle($title);
  $text = $rev->getText();
  file_put_contents("outfile.txt", $text);
(You can use maintenance/eval.php to run code within the MediaWiki framework from the command line.)
-- brion vibber (brion @ wikimedia.org)