[WikiEN-l] Tagging foreign language text with templates

Matthew Brown morven at gmail.com
Tue May 15 19:50:45 UTC 2007


On 5/15/07, Ray Saintonge <saintonge at telus.net> wrote:
> Matthew Brown wrote:
> >People have been doing this to Japanese for a while - see {{nihongo}}.
> >[...]
> So transliterate them unless there is already an established English
> form.  The purpose of original script is for people to be able to follow
> the matter up in that language.  An interwiki link to the WP in that
> language or to Wiktionary would at least have some usefulness.

Did you even look at what that template does?  It simply enforces a
standard format of "<English translation> (<Japanese script>
<transliterated Japanese>)" for such things.
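
To make that format concrete, here is a minimal Python sketch of the
output pattern described above. The function name format_nihongo and its
parameters are hypothetical, chosen only for illustration; this is not how
the template itself is implemented, since it is written in wiki markup.

```python
def format_nihongo(english: str, script: str, transliteration: str) -> str:
    """Return '<English> (<Japanese script> <transliterated Japanese>)'.

    Hypothetical helper that mirrors the format quoted in the message;
    it is not the template's real implementation.
    """
    return f"{english} ({script} {transliteration})"


print(format_nihongo("Tokyo", "東京", "Tōkyō"))
```

The point of routing every occurrence through one function (or one
template) is that the order and punctuation of the three forms can never
drift between articles.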

> >and a standard format for putting the different
> >forms is nice.
> >
> Why?

For the same reason that we use templates in general: to improve consistency.

> >The point of this template is that it marks up the language used so
> >that it can be displayed or spoken correctly.  It encloses it in <span
> >lang="language"></span> tags.
> >
> It just puts tags around it.  How is that going to get things pronounced
> "correctly".  Is that even needed?  Do we tag mathematical or musical
> expressions for proper pronunciation?

It puts <span lang="language"></span> tags around it.  This enables
browsers to display it correctly for the language in question, enables
screen readers for the blind to read it correctly instead of
attempting to pronounce it as an English word, gives useful hints to
translation software, and so on.
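
As a rough illustration of the markup involved, the following Python
sketch wraps a piece of text in a span carrying a lang attribute. The
helper name lang_span is an assumption made for this example, and the
real template does this in wiki markup rather than Python.

```python
from html import escape


def lang_span(text: str, lang: str) -> str:
    """Wrap text in <span lang="..."></span>, escaping HTML metacharacters.

    Hypothetical helper; `lang` is expected to be a language code such
    as "ja" or "fr".
    """
    return f'<span lang="{escape(lang, quote=True)}">{escape(text)}</span>'


print(lang_span("東京", "ja"))
```

A browser or screen reader that sees the lang attribute can then choose a
suitable font or voice for the enclosed text instead of assuming the
page's default language.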

> >It's important to note that Unicode does not encode the language, just
> >the characters.
> >
> That's as it should be.

Arguably yes, but it means that some things can't be done in Unicode
(e.g. displaying Chinese-origin characters correctly for the language
they are written in). Notation outside of Unicode is needed to specify
those things.
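
A small Python sketch of why the notation has to live outside Unicode:
the same code point is used whichever language the character is written
in, so only the surrounding markup can distinguish the two cases. The
character chosen here (U+76F4) is just one example of a unified
character, and the variable names are illustrative.

```python
# The character is stored identically regardless of the language it is
# being used to write.
text = "直"
code_points = [f"U+{ord(ch):04X}" for ch in text]

# Only the markup around it records the intended language.
japanese = f'<span lang="ja">{text}</span>'
chinese = f'<span lang="zh">{text}</span>'

print(code_points)
print(japanese)
print(chinese)
```

Both spans contain exactly the same character data; the lang attribute is
the only thing a renderer can use to pick the appropriate glyph shape.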

> >Read up on [[Han unification]] to understand the
> >problems this gives with characters deemed the same across multiple
> >Asian languages even if the characters are actually written quite
> >differently when used to write Japanese vs. Chinese, for instance.
> >
> In situations where this matters the people involved already have a
> reasonable knowledge of the language(s) involved.

Are you talking about Wikipedia's writers or readers here?  Displaying
things correctly for our readers is important, I'd have thought.

> >As well as display/typography issues not handled in Unicode, this also
> >allows screen readers and the like to have a better chance of
> >understanding words in different languages.
> >
> Wikipedia is not a dictionary.

Um, where does this follow from what I wrote?  We're not talking about
dictionary definitions.  We're talking about e.g. screen readers
(talking web-browsers for the sight-impaired) getting foreign words
right, or at least having a chance of so doing.

> >It's certainly neater than using the HTML, but it's not exactly 100%
> >intuitive either.  I'm torn on this one; the more complicated Wiki
> >markup becomes, the less friendly it is, but on the other hand, it's
> >not good to lose information either.
> >
> Our markups are already overcomplicated.  The last thing we need is more
> geekish imperialism.  Omitting this does not lose any notable
> information at all. If all details in an article would go to this level
> of minutiae all of them would be much longer and much more boring.

It loses the correct markup in HTML for text that's not written in the
default language for the rest of the page.  This at least has some
level of importance, as I've detailed above.

I agree with you that the harder our markup gets the harder it is to
write Wikipedia articles.  On the other hand, nobody is mandating that
people MUST use these templates to include foreign-language text or
words.  Gnomish editors will fix things later, as always happens, and
if someone comes across one of these templates, what it does seems
fairly obvious to me.

-Matt


