Re: [WikiEN-l] Pronunciations and IPA/SAMPA

5 Sep 2003

This is probably the most well-thought-out treatment
of this issue ever done on wp. I must say this is
impressive and in line with the consensus of

No Unicode IPA on IE?? Hmm. Well, considering the
expensive workarounds you listed -- as necessary to
accommodate IE users -- for a fix that is entirely in
Microsoft's domain, I would lean toward calling the
IPA Unicode "standard" anyway, and let the ?? or
Xboxes be the problem of the IE end user. This is
already the case for any character sets that aren't
loaded anyway -- (I have yet to load a Hindi
character set, for example. ;)  Soon afterward someone
will write a hack to accommodate IE no doubt, but
there's no reason not to push the Unicode IPA as the
standard right now.

But that still doesn't deal with the problem of easy
input via a Roman character set.  A little conversion
hack from the pseudovalues (/s/) to their IPA
equivalents should be a first priority, and I would do
it myself if I had the time, or could program a little
better (late bloomer ok..)
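
For illustration only, here is a minimal sketch of what such a conversion
hack might look like. It assumes the ASCII input is SAMPA (the scheme
mentioned in the quoted message below) and covers only a small subset of
the English symbols; the names SAMPA_TO_IPA and sampa_to_ipa are
hypothetical, not part of any existing Wikipedia code.

```python
# Partial SAMPA -> IPA table (English subset only; an illustrative assumption,
# not a complete or official mapping).
SAMPA_TO_IPA = {
    "{": "\u00e6",  # ae ligature, the vowel in "cat"
    "S": "\u0283",  # esh, as in "ship"
    "Z": "\u0292",  # ezh, as in "vision"
    "T": "\u03b8",  # theta, as in "thin"
    "D": "\u00f0",  # eth, as in "this"
    "N": "\u014b",  # eng, as in "sing"
    "@": "\u0259",  # schwa
    "I": "\u026a",  # the vowel in "kit"
    "U": "\u028a",  # the vowel in "foot"
    "V": "\u028c",  # the vowel in "strut"
    "A": "\u0251",  # open back unrounded vowel
    "O": "\u0254",  # open-mid back rounded vowel
    "Q": "\u0252",  # open back rounded vowel
    "E": "\u025b",  # open-mid front unrounded vowel
    "3": "\u025c",  # open-mid central vowel
    ":": "\u02d0",  # length mark
    '"': "\u02c8",  # primary stress
    "%": "\u02cc",  # secondary stress
}

# str.translate works character by character, which is enough here because
# every entry in this subset is a single ASCII character.
_TABLE = str.maketrans(SAMPA_TO_IPA)


def sampa_to_ipa(text):
    """Return text with the known SAMPA symbols replaced by IPA characters."""
    return text.translate(_TABLE)


if __name__ == "__main__":
    print(sampa_to_ipa("k{t"))
```

Characters not in the table (plain consonants such as k, t, s) pass
through unchanged, so an affricate like "tS" comes out right without a
separate entry.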

As always with apologies to the hackers,
-S-

--- David Friedland <david(a)nohat.net> wrote:
...
  There was some talk a while back about deciding on a
 standard method of 
 indicating pronunciations on Wikipedia. Of course
 some people said 
 pronunciations belong on Wiktionary, but that's
 beside the point: there 
 are many articles where a discussion of the
 pronunciation of certain 
 words is necessary, and there ought to be a standard
 way of notating that.

 In fact, there is. The International Phonetic
 Alphabet is ideally suited 
 to marking pronunciations of words, and is flexible
 enough to describe 
 broad transcriptions that represent how a word is
 pronounced in multiple 
 dialects to minute phonetic details. This wisdom, of
 course, has been 
 lost on the makers of most American dictionaries,
 who each insist upon 
 using their own ad-hoc pronunciation scheme (one of
 my personal pet 
 peeves). The _Cambridge Dictionary of American
 English_ is a notable, if 
 perhaps not well-known, exception. The foremost
 dictionary of (mostly) 
 British English, the _Oxford English Dictionary_
 uses IPA, as does the 
 major Australian English dictionary,  _The Macquarie
 Dictionary_.

 But I digress. There are several pages on the
 Wikipedia that deal 
 specifically with pronunciations, for example [[List
 of words of 
 disputed pronunciation]]. And the way that the
 pronunciations are listed 
 on that page is the worst possible mix of ad-hoc
 pronunciation schemes. 
 In fact, some of the ad-hoc pronunciations given I
 couldn't even figure 
 what they meant. (does AHSK rhyme with American
 _task_ or _mosque_?). 
 Clearly some kind of standard scheme is needed.

 I spent several hours today revamping that page,
 using IPA 
 transcriptions and doing some serious research about
 which 
 pronunciations are listed in what dictionaries. I
 put that page on 
 [[List of words of disputed pronunciation/IPA]].
 However, I later 
 discovered to my tremendous dismay that the IPA
 letters simply do not 
 display in IE. The scheme for encoding IPA in ASCII,
 called SAMPA, is 
 capable of encoding anything in IPA, but it is not
 particularly readable 
 (although some might argue the same about IPA). It
 was designed to be 
 machine-readable, and it doesn't really seem like an
 adequate solution. 
 It uses lots of non-alphabetic characters to
 represent sounds (the 'a' 
 in _cat_ is '{' in SAMPA), and as a result
 SAMPA-ized pronunciations are 
 frankly ugly.

 Anyhow, it seems that just using the HTML entities
 for the Unicode IPA 
 extensions is not an acceptable solution because it
 leaves IE users with 
 lovely but useless rectangles where there ought to
 be IPA characters. 
 There is a LaTeX extension called TIPA that allows
 the complete set of 
 IPA characters and diacritics. If this were
 installed into the TeX math 
 extensions, then a similar syntax could be used to
 generate images of 
 the IPA from LaTeX input.

 I see the following possible solutions (in the order
 that I think is good):

 1.) Auto-detect the browser and send IPA Unicode to
 browsers that 
 support it and TIPA LaTeX images to those that
 don't. (Pros: attractive 
 display of IPA for all users. Cons: lots of 
 programming)

 2.) Just send TIPA LaTeX images (Pros: attractive
 display of IPA. Cons: 
 Uses images in text when for some users embedded IPA
 Unicode would look 
 better)

 3.) Store the IPA in a special format or in a
 special tag, auto-detect 
 the browser and send IPA Unicode to browsers that
 support it and SAMPA 
 to the rest. (Pros: doesn't require inserting images
 or using TeX. Cons: 
 SAMPA is ugly and hard to read)

 4.) Render IPA into GIFs or PNGs and just insert
 them as images. (Pros: 
 compatible with everything. Cons: time-consuming,
 and difficult to change)

 5.) Devise a Wikipedia-specific pronunciation scheme
 and just use that 
 (blech!) (Pros: no coding required. Cons: YAAHPS
 (Yet Another Ad Hoc 
 Pronunciation Scheme))

 6.) Do nothing and continue to allow people to use
 ad-hoc pronunciation 
 schemes (BLECH!!) (Pros: no action required. Cons:
 maintains status quo 
 harms as described above)
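
As a rough illustration of option 3 above, the following sketch stores a
pronunciation once in SAMPA and picks the output form per request. It is
only a sketch under stated assumptions: the "MSIE" user-agent check, the
tiny symbol table, and all function names are hypothetical, and real
browser detection would need to be more careful.

```python
# Tiny SAMPA -> IPA subset, for illustration only (an assumption, not a
# complete mapping).
SAMPA_TO_IPA = {
    "{": "\u00e6",  # the vowel in "cat"
    "S": "\u0283",  # esh
    "T": "\u03b8",  # theta
    "N": "\u014b",  # eng
    "@": "\u0259",  # schwa
    "I": "\u026a",  # the vowel in "kit"
}


def to_ipa(sampa):
    """Map each known SAMPA character to IPA; leave the rest unchanged."""
    return "".join(SAMPA_TO_IPA.get(ch, ch) for ch in sampa)


def to_entities(text):
    """Encode non-ASCII characters as decimal HTML character references."""
    return "".join(ch if ord(ch) < 128 else "&#%d;" % ord(ch) for ch in text)


def supports_unicode_ipa(user_agent):
    # Assumption: any user agent containing "MSIE" cannot display the IPA
    # characters; everything else is assumed to cope.
    return "MSIE" not in user_agent


def render_pronunciation(sampa, user_agent):
    """Return HTML-ready IPA for capable browsers, raw SAMPA otherwise."""
    if supports_unicode_ipa(user_agent):
        return "/" + to_entities(to_ipa(sampa)) + "/"
    return "/" + sampa + "/ (SAMPA)"


if __name__ == "__main__":
    print(render_pronunciation("k{t", "Mozilla/5.0 (X11; Linux) Gecko"))
    print(render_pronunciation("k{t", "Mozilla/4.0 (compatible; MSIE 6.0)"))
```

Swapping the fallback branch for an image lookup would turn the same
structure into option 1.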

 Of course, no. 1 requires doing some coding and
 testing for what may end 
 up being a feature used on just a few pages. On the
 other hand, such 
 code could possibly be extremely useful for the
 Wiktionary. In the 
 meantime, I'm going to leave [[List of words of
 disputed 
 pronunciation/IPA]] as it is, and wait for
 suggestions.

 Now of course there will be opponents of the IPA,
 because it's too 
 technical or whatever reason. To those people I say
 the IPA for the 
 purposes of representing English is really no more
 complicated than the 
 pronunciation schemes used in American dictionaries,
 like the 
 _Merriam-Webster Dictionary_, and the _Cambridge
 Dictionary of American 
 English_, which is designed for learners of English,
 seems to do just 
 fine with it.

 - David [[User:Nohat]]

 <http://www.wikipedia.org/wiki/List_of_words_of_disputed_pronunciation/IPA>
...

 _______________________________________________
 WikiEN-l mailing list
 WikiEN-l(a)Wikipedia.org
 http://mail.wikipedia.org/mailman/listinfo/wikien-l 

