Re: [WikiEN-l] Pronunciations and IPA/SAMPA

4 Sep 2003

This is probably the most well-thought-out treatment
of this issue ever done on wp. I must say this is
impressive and in line with the consensus of
No Unicode IPA on IE?? Hmm. Well, considering the
expensive workarounds you listed -- as necessary to
accommodate IE users -- for a fix that is entirely in
Microsoft's domain, I would lean toward calling
Unicode IPA the "standard" anyway, and let the ?? or
Xboxes be the problem of the IE end user. This is
already the case for any character sets that aren't
loaded anyway (I have yet to load a Hindi
character set, for example ;). Soon afterward someone
will write a hack to accommodate IE, no doubt, but
there's no reason not to push Unicode IPA as the
standard right now.
But that still doesn't deal with the problem of easy
input via a Roman character set. A little conversion
hack from the pseudovalues (/s/) to their IPA
equivalents should be a first priority, and I would do
it myself if I had the time, or could program a little
better (late bloomer, ok..)
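A minimal sketch of what such a conversion hack might look like, in
Python. The function name sampa_to_ipa and the table are made up for
illustration, and the table covers only a handful of SAMPA symbols used
for English; a real converter would need the full SAMPA inventory.

```python
# Hypothetical, partial SAMPA -> IPA table (English symbols only).
SAMPA_TO_IPA = {
    "{": "æ",   # 'a' in "cat"
    "@": "ə",   # schwa
    "I": "ɪ",
    "E": "ɛ",
    "V": "ʌ",
    "U": "ʊ",
    "A": "ɑ",
    "O": "ɔ",
    "Q": "ɒ",
    "3": "ɜ",
    "S": "ʃ",
    "Z": "ʒ",
    "T": "θ",
    "D": "ð",
    "N": "ŋ",
    ":": "ː",   # length mark
    '"': "ˈ",   # primary stress
    "%": "ˌ",   # secondary stress
}


def sampa_to_ipa(sampa: str) -> str:
    """Replace each SAMPA character with its IPA equivalent.

    Characters not in the table (plain a, k, t, ...) pass through unchanged.
    """
    return "".join(SAMPA_TO_IPA.get(ch, ch) for ch in sampa)


print(sampa_to_ipa("k{t"))  # kæt
```

Because every entry here is a single character, a plain per-character
lookup is enough; affricates such as "tS" come out right because each
half is converted on its own.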
As always with apologies to the hackers,
-S-
--- David Friedland david@nohat.net wrote:
...
There was some talk a while back about deciding on a standard method
of indicating pronunciations on Wikipedia. Of course some people said
pronunciations belong on Wiktionary, but that's beside the point:
there are many articles where a discussion of the pronunciation of
certain words is necessary, and there ought to be a standard way of
notating that.
In fact, there is. The International Phonetic Alphabet is ideally
suited to marking pronunciations of words, and is flexible enough to
describe everything from broad transcriptions that represent how a
word is pronounced in multiple dialects to minute phonetic details.
This wisdom, of course, has been lost on the makers of most American
dictionaries, who each insist upon using their own ad-hoc
pronunciation scheme (one of my personal pet peeves). The _Cambridge
Dictionary of American English_ is a notable, if perhaps not
well-known, exception. The foremost dictionary of (mostly) British
English, the _Oxford English Dictionary_, uses IPA, as does the major
Australian English dictionary, _The Macquarie Dictionary_.
But I digress. There are several pages on the Wikipedia that deal
specifically with pronunciations, for example
[[List of words of disputed pronunciation]]. And the way that the
pronunciations are listed on that page is the worst possible mix of
ad-hoc pronunciation schemes. In fact, I couldn't even figure out what
some of the ad-hoc pronunciations given meant (does AHSK rhyme with
American _task_ or _mosque_?). Clearly some kind of standard scheme is
needed.
I spent several hours today revamping that page, using IPA
transcriptions and doing some serious research about which
pronunciations are listed in what dictionaries. I put that page on
[[List of words of disputed pronunciation/IPA]]. However, I later
discovered to my tremendous dismay that the IPA letters simply do not
display in IE. The scheme for encoding IPA in ASCII, called SAMPA, is
capable of encoding anything in IPA, but it is not particularly
readable (although some might argue the same about IPA). It was
designed to be machine-readable, and it doesn't really seem like an
adequate solution. It uses lots of non-alphabetic characters to
represent sounds (the 'a' in _cat_ is '{' in SAMPA), and as a result
SAMPA-ized pronunciations are frankly ugly.
Anyhow, it seems that just using the HTML entities for the Unicode IPA
extensions is not an acceptable solution because it leaves IE users
with lovely but useless rectangles where there ought to be IPA
characters. There is a LaTeX extension called TIPA that allows the
complete set of IPA characters and diacritics. If this were installed
into the TeX math extensions, then a similar syntax could be used to
generate images of the IPA from LaTeX input.
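To make the HTML-entity approach mentioned above concrete, here is a
small Python sketch that turns an IPA string into numeric character
references. The function name to_html_entities is hypothetical; the
"xmlcharrefreplace" error handler is part of Python's standard codecs.

```python
def to_html_entities(text: str) -> str:
    """Replace every non-ASCII character with a decimal HTML entity."""
    return text.encode("ascii", "xmlcharrefreplace").decode("ascii")


print(to_html_entities("kæt"))  # k&#230;t
```

A browser with a suitable font shows the entity as the IPA letter; one
without it shows the rectangle described above, so this only moves the
problem to the reader's end.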
I see the following possible solutions (in my order of preference):
1.) Auto-detect the browser and send IPA Unicode to browsers that
support it and TIPA LaTeX images to those that don't. (Pros:
attractive display of IPA for all users. Cons: lots of programming)
2.) Just send TIPA LaTeX images. (Pros: attractive display of IPA.
Cons: uses images in text when for some users embedded IPA Unicode
would look better)
3.) Store the IPA in a special format or in a special tag, auto-detect
the browser and send IPA Unicode to browsers that support it and SAMPA
to the rest. (Pros: doesn't require inserting images or using TeX.
Cons: SAMPA is ugly and hard to read)
4.) Render IPA into GIFs or PNGs and just insert them as images.
(Pros: compatible with everything. Cons: time-consuming, and difficult
to change)
5.) Devise a Wikipedia-specific pronunciation scheme and just use that
(blech!) (Pros: no coding required. Cons: YAAHPS (Yet Another Ad Hoc
Pronunciation Scheme))
6.) Do nothing and continue to allow people to use ad-hoc
pronunciation schemes (BLECH!!) (Pros: no action required. Cons:
maintains status quo harms as described above)
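The browser auto-detection in option 1 could be sketched roughly as
follows, in Python. Everything here is an assumption for illustration:
the function names, the rule that any User-Agent containing "MSIE"
cannot display IPA, and the /ipa-image URL standing in for whatever
would actually serve the TIPA-rendered images.

```python
from urllib.parse import quote


def supports_unicode_ipa(user_agent: str) -> bool:
    # Assumption: IE identifies itself with "MSIE" and cannot show IPA.
    return "MSIE" not in user_agent


def render_pronunciation(ipa: str, user_agent: str) -> str:
    """Return inline Unicode IPA, or an image tag for browsers without it."""
    if supports_unicode_ipa(user_agent):
        return ipa
    # Hypothetical endpoint that would return a TIPA-rendered image.
    return '<img src="/ipa-image?text={}" alt="[IPA]">'.format(quote(ipa))


ie = "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)"
print(render_pronunciation("kæt", ie))
# <img src="/ipa-image?text=k%C3%A6t" alt="[IPA]">
```

Swapping the image branch for a SAMPA string would give option 3
instead; the detection step stays the same.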
Of course, no. 1 requires doing some coding and testing for what may
end up being a feature used on just a few pages. On the other hand,
such code could possibly be extremely useful for the Wiktionary. In
the meantime, I'm going to leave
[[List of words of disputed pronunciation/IPA]] as it is, and wait for
suggestions.
Now of course there will be opponents of the IPA, because it's too
technical or for whatever reason. To those people I say that the IPA,
for the purposes of representing English, is really no more
complicated than the pronunciation schemes used in American
dictionaries, like the _Merriam-Webster Dictionary_, and the
_Cambridge Dictionary of American English_, which is designed for
learners of English, seems to do just fine with it.

David [[User:Nohat]]

http://www.wikipedia.org/wiki/List_of_words_of_disputed_pronunciation/IPA*
...
