On Thu, Mar 07, 2002 at 11:34:53PM -0800, Brion L. VIBBER wrote:
On Thu, 2002-03-07 at 22:07, Tomasz Wegrzanowski wrote:
Just see articles about anything Japanese on English Wikipedia. They contain Japanese names of everything.
Sure, but more often kanji than kana, so special kana markup wouldn't be that big a win. See the thread "International Upgrades"; the vague plan is to standardise the internal character set and present the wikipedias in Unicode to capable browsers. (Please comment!)
Uhm, right. But most non-Japanese people don't know the names of many kanji, so kanji aren't that important. ;) On the other hand, more people than is usually thought know kana, so it might be beneficial for them.
Hmmm. Now I think that some general method would be more useful: &katakana_a;, &kanji_b;, &hebrew_c;, or &cyrillic_d;.
I think it wouldn't need too many changes to the parser. Perl code. Init:
%Entities = ('&katakana_o;' => 'オ', ... );
On HTML output:
s/(&[a-zA-Z0-9_]+;)/$Entities{$1} ? $Entities{$1} : $1/eg;
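For comparison, here is a sketch of the same entity-expansion idea in Python; the table contents and function name are illustrative, not anything from the actual wiki code, and a real table would of course cover whole scripts:

```python
import re

# Illustrative entity table (keys are bare entity names, values are the
# Unicode characters they stand for).
ENTITIES = {
    "katakana_o": "\u30aa",  # オ
    "cyrillic_d": "\u0434",  # д
}

def expand_entities(text):
    # Replace each &name; with its mapped character; unknown entities
    # are left untouched, just like the Perl one-liner above.
    return re.sub(
        r"&([A-Za-z0-9_]+);",
        lambda m: ENTITIES.get(m.group(1), m.group(0)),
        text,
    )
```

So `expand_entities("see &katakana_o; here")` would yield the katakana character inline, while an unrecognized `&foo;` passes through unchanged.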
As a result, we should be able to use the customary input methods or cut-and-paste to put any characters into any of the wikis, which is certainly a lot easier than looking up entities or running text through a UTF-8-to-entities converter (which is what I currently do).
-- brion vibber (brion @ pobox.com)
Hmmm. Wouldn't that need some modifications to browsers?