[MediaWiki-l] Hidden categories: page_props consistence

Bartosz Dziewoński matma.rex at gmail.com
Thu Jul 20 01:25:05 UTC 2017


All text in MediaWiki's database uses MySQL's "BINARY" encoding by 
default, even though it's encoded in UTF-8, for historical reasons 
(MySQL's UTF-8 support used to be horribly broken).

The software you're using to view it therefore decides to show you the 
raw bytes as hexadecimal, rather than the actual text. I've seen this 
annoying behavior in some versions of phpMyAdmin and MySQL Workbench; I 
don't know what you're using.

You can "decode" the value '68696464656e636174' based on the ASCII 
values for the hex codes:

0x68 = 'h'
0x69 = 'i'
0x64 = 'd'
0x64 = 'd'
0x65 = 'e'
0x6e = 'n'
0x63 = 'c'
0x61 = 'a'
0x74 = 't'
(https://en.wikipedia.org/wiki/ASCII#Printable_characters)
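
If you'd rather not decode it by hand, the same lookup can be done in a
few lines of script. This is only a rough sketch in Python (my choice of
language here; the variable names are made up for illustration), and it
assumes the value really is UTF-8 text shown as hex digits:

```python
# Hex string exactly as the viewer displayed it.
hex_value = "68696464656e636174"

# Turn each pair of hex digits back into one byte.
raw = bytes.fromhex(hex_value)

# MediaWiki stores its text as UTF-8, so decode the bytes as UTF-8.
text = raw.decode("utf-8")

print(text)  # prints: hiddencat
```

Any value that isn't valid UTF-8 would raise a UnicodeDecodeError here,
which would be a hint that the column holds something other than text.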

Or, you can cast it to the 'char' type in your query to make your 
software behave:

   SELECT CAST(pp_propname as char) FROM page_props;

The actual data stored in your database has been 'hiddencat' all along,
and MediaWiki doesn't need to do anything special to get that. The 
problem is with the viewer software being too clever.

-- 
Bartosz Dziewoński


