Possibly off-topic.
Here is a script that replaces normal whitespace with one of the other whitespace characters supported by UTF-8 (others are           ​  ).
I have run a few vandalism tests here: http://en.wikipedia.org/wiki/User:Tei/lalaland
What do you guys think? Could this be a problem? You can break links like [[Mr Thonson]] by replacing it with [[Mr Thonson]], which looks identical but contains a different space character.
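use strict;
use warnings;

# Replace every ASCII space with U+2000, written out as its three UTF-8 bytes.
while (my $line = <DATA>) {
    foreach my $ch (split //, $line) {
        if ($ch eq " ") {
            # "C" packs unsigned bytes; "c" is signed and warns on values above 127.
            print pack("C3", 0xe2, 0x80, 0x80);
        } else {
            print $ch;
        }
    }
}
__DATA__
Text to be vandalized goes here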
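
For anyone wondering how this could be spotted, here is a rough sketch of a check, not anything MediaWiki is known to do. It decodes the bytes the script above emits and maps every Unicode space separator (\p{Zs}) back to a plain ASCII space before comparing titles. The helper name normalize_spaces is made up for this example, and zero-width characters such as U+200B are not in \p{Zs}, so they would need separate handling.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Hypothetical helper: turn any Unicode space separator into an ASCII space.
sub normalize_spaces {
    my ($text) = @_;
    $text =~ s/\p{Zs}/ /g;
    return $text;
}

my $plain   = "Mr Thonson";
# The same title with the space replaced by the bytes 0xE2 0x80 0x80 (U+2000),
# decoded from UTF-8 so that the regex sees one character, not three bytes.
my $altered = decode("UTF-8", "Mr\xE2\x80\x80Thonson");

printf "equal before normalizing: %s\n",
    ($plain eq $altered ? "yes" : "no");
printf "equal after normalizing:  %s\n",
    (normalize_spaces($plain) eq normalize_spaces($altered) ? "yes" : "no");
```

The first comparison reports "no" and the second "yes", which is why the altered link points at a different page even though both titles look the same on screen.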