On Fri, 2002-03-08 at 10:26, Tomasz Wegrzanowski wrote:
> On Fri, Mar 08, 2002 at 10:07:58AM -0800, Brion L. VIBBER wrote:
> > That said, I'm still not convinced there's much usefulness in more
> > special codes that work on our wiki and nowhere else in the world;
> > only a fraction of the above use kana at all.
>
> *Anything* is better than using numerics. Many of them are mixed
> kana + kanji. That still saves half of the work.
What are you doing, looking up every character individually? No wonder you're having trouble!
What I currently do is type the desired text into yudit (http://yudit.org) using its support for the kinput2 input method (or cut-n-paste into yudit from another web page), save the file, and run it through this little program:
#!/usr/bin/perl -p
# disassemble non-ASCII codes from UTF-8 stream
# borrowed from http://czyborra.com/utf/
#$format=$ENV{"UCFORMAT"}||'<U%04X>';
$format='&#%d;';
s/([\xC0-\xDF])([\x80-\xBF])/sprintf($format,
  unpack("c",$1)<<6&0x07C0|
  unpack("c",$2)&0x003F)/ge;
s/([\xE0-\xEF])([\x80-\xBF])([\x80-\xBF])/sprintf($format,
  unpack("c",$1)<<12&0xF000|
  unpack("c",$2)<<6&0x0FC0|
  unpack("c",$3)&0x003F)/ge;
s/([\xF0-\xF7])([\x80-\xBF])([\x80-\xBF])([\x80-\xBF])/sprintf($format,
  unpack("c",$1)<<18&0x1C0000|
  unpack("c",$2)<<12&0x3F000|
  unpack("c",$3)<<6&0x0FC0|
  unpack("c",$4)&0x003F)/ge;
Paste the output into the Wikipedia edit box, and presto!
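(For the curious: the same transformation can be sketched in a few lines of Python. This is just an illustration of the idea, not part of my toolchain; the function name `to_numeric_refs` is made up. Instead of pattern-matching the raw UTF-8 byte sequences the way the Perl script does, it decodes the bytes first and emits an &#NNN; HTML numeric character reference for each non-ASCII character.)

```python
def to_numeric_refs(utf8_bytes: bytes) -> str:
    """Replace every non-ASCII character in a UTF-8 byte stream
    with an HTML decimal numeric character reference (&#NNN;)."""
    text = utf8_bytes.decode("utf-8")
    return "".join(
        ch if ord(ch) < 128 else "&#%d;" % ord(ch)
        for ch in text
    )

# Example: three kanji become three numeric entities.
print(to_numeric_refs("日本語".encode("utf-8")))
# -> &#26085;&#26412;&#35486;
```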
If I have one name, it gets done at once. If I have two names, they get done at once. If I put in a whole passage of text, it all gets done at once. It would actually be *more* work for me to separately write out the kana characters in special codes.
Once we've got the new system with Unicode up, you should be able to type or paste the characters in directly (unless you have a very limited browser, see my earlier post) and bypass all this rigamarole.
-- brion vibber (brion @ pobox.com)