On Fri, 2002-03-08 at 10:26, Tomasz Wegrzanowski wrote:
> On Fri, Mar 08, 2002 at 10:07:58AM -0800, Brion L. VIBBER wrote:
> > That said, I'm still not convinced there's much usefulness in more
> > special codes that work on our wiki and nowhere else in the world; only
> > a fraction of the above use kana at all.
> *Anything* is better than using numerics.
> > Many of them are mixed kana + kanji.
> That still saves half of the work.
What are you doing, looking up every character individually? No wonder
you're having trouble!
What I currently do is type the desired text into yudit (http://yudit.org)
using its support for the kinput2 input method (or cut-n-paste into yudit
from another web page), save the file, and run it through this little
program:
#!/usr/bin/perl -p
# disassemble non-ASCII codes from UTF-8 stream
# borrowed from http://czyborra.com/utf/
#$format=$ENV{"UCFORMAT"}||'<U%04X>';
$format='&#%d;';
s/([\xC0-\xDF])([\x80-\xBF])/sprintf($format,
unpack("c",$1)<<6&0x07C0|unpack("c",$2)&0x003F)/ge;
s/([\xE0-\xEF])([\x80-\xBF])([\x80-\xBF])/sprintf($format,
unpack("c",$1)<<12&0xF000|unpack("c",$2)<<6&0x0FC0|unpack("c",$3)&0x003F)/ge;
s/([\xF0-\xF7])([\x80-\xBF])([\x80-\xBF])([\x80-\xBF])/sprintf($format,
unpack("c",$1)<<18&0x1C0000|unpack("c",$2)<<12&0x3F000|
unpack("c",$3)<<6&0x0FC0|unpack("c",$4)&0x003F)/ge;
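(As an aside, not part of Brion's setup: a language with built-in Unicode support can skip the byte-twiddling entirely by decoding the UTF-8 stream first and then emitting the same &#NNN; decimal references. A minimal Python sketch of the equivalent filter:)

```python
import sys

def to_numeric_refs(utf8_bytes):
    """Replace each non-ASCII character in a UTF-8 byte stream
    with an HTML decimal character reference (&#NNN;)."""
    return "".join(ch if ord(ch) < 0x80 else "&#%d;" % ord(ch)
                   for ch in utf8_bytes.decode("utf-8"))

if __name__ == "__main__":
    # Filter stdin to stdout, like the Perl -p one-liner above.
    sys.stdout.write(to_numeric_refs(sys.stdin.buffer.read()))
```

Run it the same way, e.g. `python3 ncr.py < saved-file.txt` (the script name is just an example).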
Paste the output into the Wikipedia edit box, and presto!
If I have one name, it gets done at once. If I have two names, they get
done at once. If I put in a whole passage of text, it all gets done at
once. It would actually be *more* work for me to separately write out
the kana characters in special codes.
Once we've got the new system with Unicode up, you should be able to
type or paste the characters in directly (unless you have a very limited
browser, see my earlier post) and bypass all this rigamarole.
-- brion vibber (brion @ pobox.com)