On 9/22/05, Jim Breen <Jim.Breen@infotech.monash.edu.au> wrote:
[Gerard Meijssen (Re: [Wiktionary-l] English orthographies) writes:]
Jim Breen wrote:
Jack & Naree wrote:
> An American-English dictionary, and a (Commonwealth) English Dictionary then.
> Otherwise, it has to be all listed as separate entries.
Um. What about words that are spelled differently within and between Commonwealth countries? Or words like accoutrement/accouterment which are spelled differently within the US?
Of course they shouldn't be separate entries, but Gerard's database design seems to be dictating that.
The fact that words are spelled differently needs to be addressed in one way or the other.
Absolutely.
Even in old style wiktionary there needs to be something both at the accoutrement and the accouterment article in order to make them "findable".
Agreed.
The English Wiktionary nowadays frowns on the use of redirects so it is more substantial than that.
Yes, in fact it is the frowning on redirects that led me to look at the UW proposals. I was looking at the Wiktionary structure to see if it would be a suitable environment for my Japanese-Multilingual dictionary database. I ran into a number of problems, one of which was the "no redirects" policy, and someone suggested I look at UW.
The "frowning on redirects" policy is largely due to the fact that we have many languages in one "namespace". When a particular English spelling variant, or even a plural, happens to coincide with the spelling of a word in another language, we have to have two pages anyway. This is not uncommon. We then decided it was better to aim for consistency rather than having some shared pages and some redirects. The other major issue was how to serve both the British (colour, centre) market and the American (color, center) market without forcing upon anyone a view of which spelling is the "standard" and which is the "variant", which is what redirects lead to.
When these distinctions are at the database level, they can be presented to the user by the software in any number of ways.
In itself, no value is added by each spelling having its own record in the tables Expression and Word. The words can be connected through SynTrans to the same meaning. They can be related through Relation to say that they are alternative ways of spelling.
Consequently, there is nothing special in having both accoutrement and accouterment exist within the database. The thing that is relevant is that they are both shown to the user of the dictionary who looks up either Expression.
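As a rough illustration of the point above, here is a minimal sketch in Python with an in-memory SQLite database. Only the table names Expression, SynTrans and Relation come from this thread; the Meaning table, every column name, the definition text and the helper functions are assumptions made up for the example, not the actual UW schema.

```python
import sqlite3

# Hypothetical, simplified schema. Only the table names Expression, SynTrans
# and Relation appear in the discussion; all columns are assumptions.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Meaning    (id INTEGER PRIMARY KEY, definition TEXT);
    CREATE TABLE Expression (id INTEGER PRIMARY KEY, spelling TEXT, language TEXT);
    CREATE TABLE SynTrans   (expression_id INTEGER, meaning_id INTEGER);
    CREATE TABLE Relation   (from_id INTEGER, to_id INTEGER, kind TEXT);

    INSERT INTO Meaning VALUES (1, 'additional items of dress or equipment');
    INSERT INTO Expression VALUES (1, 'accoutrement', 'en');
    INSERT INTO Expression VALUES (2, 'accouterment', 'en');
    -- Both spellings point at the same meaning record.
    INSERT INTO SynTrans VALUES (1, 1);
    INSERT INTO SynTrans VALUES (2, 1);
    -- The two Expressions are explicitly marked as spelling variants.
    INSERT INTO Relation VALUES (1, 2, 'alternative spelling');
""")


def look_up(spelling):
    """Map each definition of `spelling` to all spellings sharing that meaning."""
    rows = conn.execute("""
        SELECT m.definition, other.spelling
        FROM Expression e
        JOIN SynTrans st      ON st.expression_id = e.id
        JOIN Meaning m        ON m.id = st.meaning_id
        JOIN SynTrans st2     ON st2.meaning_id = m.id
        JOIN Expression other ON other.id = st2.expression_id
        WHERE e.spelling = ?
        ORDER BY m.id, other.spelling
    """, (spelling,)).fetchall()
    result = {}
    for definition, form in rows:
        result.setdefault(definition, []).append(form)
    return result


def alternative_spellings(spelling):
    """Follow Relation rows in both directions to find spelling variants."""
    rows = conn.execute("""
        SELECT b.spelling FROM Relation r
        JOIN Expression a ON a.id = r.from_id
        JOIN Expression b ON b.id = r.to_id
        WHERE a.spelling = ? AND r.kind = 'alternative spelling'
        UNION
        SELECT a.spelling FROM Relation r
        JOIN Expression a ON a.id = r.from_id
        JOIN Expression b ON b.id = r.to_id
        WHERE b.spelling = ? AND r.kind = 'alternative spelling'
    """, (spelling, spelling)).fetchall()
    return [row[0] for row in rows]
```

With this layout, look_up("accoutrement") and look_up("accouterment") return the same result, even though each spelling has its own Expression record and the definition is stored only once.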
Absolutely.
When they are alternate spellings within the same Language, they will be seen as such. As far as I am concerned, this is much ado about nothing.
Provided: (a) the essential information (senses, POS, etymology, etc.) only has to be entered once, and remains the same for all the spelling and orthographical variants;
Sometimes some of these will be different. In Britain and the Commonwealth except Canada, "tire" only means "become tired". In US and Canadian English it also means "tyre", the rubber ring on the outside of a wheel. But these are homonyms rather than senses, though many people who are not lexicography-savvy don't realise the difference.
(b) the user, on entering either form, gets the one collection of information which shows all the alternative forms of the word, then I really have no objection. I can't understand why they are in different database records, and in the case of my own JMdict (XML) they aren't, but then I don't use SynTrans, etc.
Basically it's an arbitrary database design issue. UW is going for more granularity. In this way it's probably more object-oriented since it breaks things down into more, smaller objects. There is nothing intrinsically right or wrong about either approach.
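For contrast, the coarser-grained alternative (one record holding all spelling variants) might look like the sketch below. This is not the actual JMdict structure; the Entry class, its fields, the definition text and the look_up helper are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class Entry:
    """One dictionary record; hypothetical fields, not the real JMdict layout."""
    forms: list    # every spelling variant of the same word
    pos: str       # part of speech, entered once for all variants
    senses: list   # senses, entered once for all variants


entries = [
    Entry(
        forms=["accoutrement", "accouterment"],
        pos="noun",
        senses=["additional items of dress or equipment"],
    ),
]

# Build a lookup index so that every variant leads to the same record.
index = {}
for entry in entries:
    for form in entry.forms:
        index.setdefault(form, []).append(entry)


def look_up(form):
    """Return all entries that list `form` among their spellings."""
    return index.get(form, [])
```

Here conditions (a) and (b) hold by construction: the senses and part of speech are stored once, and either spelling retrieves the same Entry object. The trade-off is that a fact about a single variant (such as the date of its first accepted use) needs somewhere else to live.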
> Frankly I favour the first option, because to non-American-English speakers, the American spellings are simply misspellings.
Well that's news to this non-American-English speaker 8-)} I don't regard them as "misspellings" at all. Just different.
I would regard all accepted words as Expressions. In order to know more about an expression, you have to add more information to enrich the experience. It will, for instance, be possible to date the first accepted use of the later spelling.
That's useful (and very often difficult to establish).
The etymology is also different.
Not really. I don't know about the languages I don't speak (i.e. everything apart from English, Japanese, French and a little Latin), but in general the spelling has little or nothing to do with the etymology.
Sometimes one spelling is definitely known to be derived from another and both remain in use in various places. For instance, the Spanish word for "peanut" was borrowed from Nahuatl in Mexico as "cacahuate", but when it was later borrowed into Spain itself it became "cacahuete". It would be a shame not to have a way to record such things in the cases where we do know them.
Then again, this may not be of interest to you but it is there for those who find it of interest.
As a lexicographer I am always interested in etymology. I am a bit mystified by the view that it is somehow coupled to spelling. In the languages I know, spelling used to be highly fluid and individualistic, and has only recently been pinned down into recognized norms. In the case of English, the fact that there are two "schools" of spelling (which only affect a minority of words) is largely the result of the simplifications made and promulgated by one man: Noah Webster. Interesting indeed, but nothing to do with etymology.
Spelling may not be exactly part of the definition of "etymology", but it is a part of the broader subject of "word history". The former term is more popular and is often used to cover both, rightly or wrongly.
As regards Webster's simplifications, in fact I've read it's not so simple and that some spellings were concocted by Samuel Johnson. Some of those were continued by Webster and some he "reverted".
Andrew Dunbar (from Australia if it makes a difference)
Cheers
Jim
--
Jim Breen http://www.csse.monash.edu.au/~jwb/
Clayton School of Information Technology, Tel: +61 3 9905 9554
Monash University, VIC 3800, Australia Fax: +61 3 9905 5146
(Monash Provider No. 00008C)