On Thu, Mar 07, 2002 at 11:34:53PM -0800, Brion L. VIBBER wrote:
On Thu, 2002-03-07 at 22:07, Tomasz Wegrzanowski wrote:
Just see articles about anything Japanese on
English Wikipedia.
They contain Japanese names of everything.
Sure, but more often kanji than kana, so special kana markup wouldn't be
that big a win. See the thread "International Upgrades"; the vague plan
is to standardise the internal character set and present the wikipedias
in Unicode to capable browsers. (Please comment!)
Uhm, right. But most non-Japanese people don't know the names of many kanji,
so kanji aren't that important. ;) On the other hand, more people than is
usually thought know kana, so it might be beneficial for them.
Hmmm. Now I think that some general method would be more useful:
&katakana_a; &kanji_b; &hebrew_c; or &cyrillic_d;
I don't think it would need many changes to the parser.
Perl code:

Init:

    my %Entities = (
        '&katakana_o;' => 'オ',
        ...
    );

On HTML output:

    s/(&[a-zA-Z0-9_]+;)/exists $Entities{$1} ? $Entities{$1} : $1/eg;
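For concreteness, here is a runnable version of that sketch. The entity table is illustrative; `&katakana_o;` is the only entry that appears in this thread, and the helper name `expand_entities` is mine, not anything from the wiki codebase:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Illustrative entity table; real deployment would fill this in per script.
my %Entities = (
    '&katakana_o;' => 'オ',
);

sub expand_entities {
    my ($text) = @_;
    # Replace known entities; pass unknown ones through untouched,
    # so ordinary HTML entities like &amp; are left alone.
    $text =~ s/(&[a-zA-Z0-9_]+;)/exists $Entities{$1} ? $Entities{$1} : $1/eg;
    return $text;
}

print expand_entities('Sounds like &katakana_o; to me; &amp; stays.'), "\n";
# prints: Sounds like オ to me; &amp; stays.
```

Unknown entities falling through unchanged matters here: the page may legitimately contain entities the table doesn't know about, and eating them would corrupt the output.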
As a result, we should be able to use the customary
input methods or
cut-n-paste to put any characters into any of the wikis, which is
certainly a lot easier than looking up entities or running text through
a UTF-8-to-entities converter (which is what I currently do).
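The converter mentioned here isn't shown in the thread; a minimal stand-in, assuming Perl with the standard Encode module (the function name `utf8_to_entities` is mine), could look like:

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Encode qw(decode);

# Take UTF-8 bytes, return pure ASCII with every non-ASCII character
# turned into a numeric character reference (&#NNNN;).
sub utf8_to_entities {
    my ($bytes) = @_;
    my $text = decode('UTF-8', $bytes);
    $text =~ s/([^\x00-\x7F])/sprintf('&#%d;', ord($1))/eg;
    return $text;
}

# "\xE6\x9D\xB1\xE4\xBA\xAC" is UTF-8 for 東京.
print utf8_to_entities("Tokyo \xE6\x9D\xB1\xE4\xBA\xAC"), "\n";
# prints: Tokyo &#26481;&#20140;
```

Decoding first and substituting per character is what keeps multibyte sequences intact; a byte-level regex would split them.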
-- brion vibber (brion @ pobox.com)
Hmmm. Wouldn't that need some modifications to browsers?