On 9/22/05, Jim Breen <Jim.Breen(a)infotech.monash.edu.au> wrote:
[Gerard Meijssen (Re: [Wiktionary-l] English orthographies) writes:]
> Jim Breen wrote:
> >>>Jack & Naree wrote:
> >>>>An American-English dictionary, and a (Commonwealth) English
> >>>>Dictionary then.
> >>>>Otherwise, it has to be all listed as separate entries.
> >
> >Um. What about words that are spelled differently within and
> >between Commonwealth countries? Or words like accoutrement/accouterment
> >which are spelled differently within the US?
> >
> >Of course they shouldn't be separate entries, but Gerard's
> >database design seems to be dictating that one.
> >
> The fact that words are spelled differently needs to be addressed in one
> way or the other.
Absolutely.
> Even in old style wiktionary there needs to be
> something both at the accoutrement and the accouterment article in order
> to make them "findable".
Agreed.
> The English Wiktionary nowadays frowns on the
> use of redirects so it is more substantial than that.
Yes, in fact it is the frowning on redirects that led me to looking at
the UW proposals. I was looking at the Wiktionary structure to see if it
would be a suitable environment for my Japanese-Multilingual dictionary
database. I ran into a number of problems, one of which was the "no
redirects" policy, and someone suggested I look at UW.
The "frowning on redirects" policy is largely due to the fact that we have many
languages in one "namespace". When a particular English spelling variant or
even a plural happens to coincide with the spelling of another word in another
language then we have to have two pages anyway. This is not uncommon.
We then decided it was better to try for some consistency rather than having
some shared pages and some redirects. The other major issue was how to
serve both the British (colour, centre) market and the American (color,
center) market without forcing on anyone a decision about which spelling is
the "standard" and which is the "variant", a decision that redirects
inevitably imply.
When these distinctions are at the database level, they can be presented to
the user by the software in any number of ways.
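To illustrate the point about presentation, here is a minimal Python sketch. It assumes spellings are stored as a group with no "standard" flag, and the reader's preference picks the headword at display time. The names Spelling, SPELLING_GROUPS, present and the region labels are hypothetical and are not taken from Wiktionary or UW.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Spelling:
    text: str
    region: str  # hypothetical label, e.g. "British" or "American"


# Each group holds spellings of the same word; none is marked as "standard".
SPELLING_GROUPS = [
    (Spelling("colour", "British"), Spelling("color", "American")),
    (Spelling("centre", "British"), Spelling("center", "American")),
]


def present(word, preferred_region):
    """Return (headword, other spellings) for the reader's preferred region."""
    for group in SPELLING_GROUPS:
        if any(s.text == word for s in group):
            # Pick the spelling for the preferred region, else keep what was typed.
            headword = next(
                (s.text for s in group if s.region == preferred_region), word
            )
            others = [s.text for s in group if s.text != headword]
            return headword, others
    # Unknown word: show it as entered, with no variants.
    return word, []


print(present("color", "British"))
print(present("centre", "American"))
```

The same stored data serves both markets; only the display step differs.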
> In itself there is no value added to the fact that it has its own record
> in the tables Expression and Word. The words can be connected through
> SynTrans to the same meaning. They can be related through Relation to
> say that they are alternative ways of spelling.
>
> Consequently, there is nothing special in having both accoutrement and
> accouterment exist within the database. The thing that is relevant is
> that they are both shown to the user of the dictionary who looks up
> either Expression.
Absolutely.
> When they are alternate spellings within the same
> Language, they will be seen as such. So as far as I am concerned, this
> seems to me to be much ado about nothing.
Provided:
(a) the essential information (senses, POS, etymology, etc.) only has to
be entered once, and remains the same for all the spelling and
orthographical variants;
Sometimes some of these will be different. In British English, and in the
Commonwealth except Canada, "tire" only means "become tired". In US and
Canadian English it also means "tyre", the rubber ring on the outside of a
wheel. But these are homonyms rather than senses, though many people who are
not lexicography-savvy don't realise the difference.
(b) the user, on entering either form, gets the one collection of
information which shows all the alternative forms of the word, then
I really have no objection. I can't understand why they are in different
database records, and in the case of my own JMdict (XML) they aren't,
but then I don't use SynTrans, etc.
Basically it's an arbitrary database design issue. UW is going for more
granularity. In this way it's probably more object-oriented since it breaks
things down into more, smaller objects. There is nothing intrinsically right or
wrong about either approach.
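The more granular design can be sketched roughly as follows. Expression, SynTrans and Relation are the table names mentioned in this thread; the fields, the Meaning and Lexicon classes, and the sample definition are assumptions made purely for illustration and are not UW's actual schema.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Expression:
    spelling: str
    language: str


@dataclass
class Meaning:
    definition: str
    part_of_speech: str


@dataclass
class Lexicon:
    meanings: dict = field(default_factory=dict)   # meaning id -> Meaning
    syntrans: list = field(default_factory=list)   # (Expression, meaning id)
    relations: list = field(default_factory=list)  # (Expression, Expression, kind)

    def add_meaning(self, meaning_id, meaning, expressions):
        # The sense is entered once and linked to every spelling.
        self.meanings[meaning_id] = meaning
        for expr in expressions:
            self.syntrans.append((expr, meaning_id))

    def relate(self, a, b, kind):
        self.relations.append((a, b, kind))

    def lookup(self, expr):
        # Collect the shared senses plus any related alternative spellings.
        ids = [mid for e, mid in self.syntrans if e == expr]
        variants = set()
        for a, b, kind in self.relations:
            if kind == "alternative spelling":
                if a == expr:
                    variants.add(b.spelling)
                elif b == expr:
                    variants.add(a.spelling)
        return {
            "senses": [self.meanings[mid] for mid in ids],
            "alternative_spellings": sorted(variants),
        }


lexicon = Lexicon()
us = Expression("accouterment", "English")
other = Expression("accoutrement", "English")
lexicon.add_meaning(
    1, Meaning("an accessory item of equipment or dress", "noun"), [us, other]
)
lexicon.relate(us, other, "alternative spelling")

print(lexicon.lookup(us))
print(lexicon.lookup(other))
```

Each spelling is its own record, yet the sense is entered once and looking up either form returns the same senses along with the other spelling.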
>
> >>>>Frankly I favour the first option, because to non-American-English
> >>>>speakers, the American spellings are simply misspellings.
> >
> >Well that's news to this non-American-English speaker 8-)} I
> >don't regard them as "misspellings" at all. Just different.
> I would regard all accepted words as Expressions. In order to know more
> about an expression, you have to add more information to enrich the
> experience. It will for instance be possible to date the first
> accepted use of the later spelling.
That's useful (and very often difficult to establish).
> The etymology is also different.
Not really. I don't know about the languages I don't speak (i.e.
everything apart from English, Japanese, French and a little Latin), but
in general the spelling has little or nothing to do with the etymology.
Sometimes one spelling is definitely known to be derived from another
and both remain in use in various places. For instance the Spanish word
for "peanut" was borrowed from Nahuatl in Mexico as "cacahuate" but
when it was later borrowed into Spain itself it became "cacahuete". It would
be a shame to not have a way to record such things in the cases we do
know them.
> Then again, this may not be of interest to you but it is there for those
> who find it of interest.
As a lexicographer I am always interested in etymology. I am a bit
mystified by the view that it is somehow coupled to spelling. In the
languages I know, spelling used to be highly fluid and individualistic, and
has only recently been pinned down into recognized norms. In the case of
English, the fact that there are two "schools" of spelling (which only
affect a minority of words) is largely the result of the simplifications
made and promulgated by one man: Noah Webster. Interesting indeed, but
nothing to do with etymology.
Spelling may not strictly be part of the definition of "etymology", but it is
part of the broader subject of "word history". The former term is more
popular and is often used to cover both, rightly or wrongly.
As regards Webster's simplifications, in fact I've read it's not so simple
and that some spellings were concocted by Samuel Johnson. Some of those were
continued by Webster and some he "reverted".
Andrew Dunbar (from Australia if it makes a difference)
Cheers
Jim
--
Jim Breen
http://www.csse.monash.edu.au/~jwb/
Clayton School of Information Technology, Tel: +61 3 9905 9554
Monash University, VIC 3800, Australia Fax: +61 3 9905 5146
(Monash Provider No. 00008C)
_______________________________________________
Wiktionary-l mailing list
Wiktionary-l(a)Wikipedia.org
http://mail.wikipedia.org/mailman/listinfo/wiktionary-l
--
http://linguaphile.sf.net