[Gerard Meijssen (Re: [Wiktionary-l] English orthographies) writes:]
Jim Breen wrote:
Jack & Naree wrote:
An American-English dictionary, and a (Commonwealth) English Dictionary then. Otherwise, it has to be all listed as separate entries.
Um. What about words that are spelled differently within and between Commonwealth countries? Or words like accoutrement/accouterment which are spelled differently within the US?
Of course they shouldn't be separate entries, but Gerard's database design seems to be dictating that one.
The fact that words are spelled differently needs to be addressed in one way or the other.
Absolutely.
Even in old style wiktionary there needs to be something both at the accoutrement and the accouterment article in order to make them "findable".
Agreed.
The English Wiktionary nowadays frowns on the use of redirects so it is more substantial than that.
Yes, in fact it is the frowning on redirects that led me to looking at the UW proposals. I was looking at the Wiktionary structure to see if it would be a suitable environment for my Japanese-Multilingual dictionary database. I ran into a number of problems, one of which was the "no redirects" policy, and someone suggested I look at UW.
In itself there is no value added to the fact that it has its own record in the tables Expression and Word. The words can be connected through SynTrans to the same meaning. They can be related through Relation to say that they are alternative ways of spelling.
Consequently, there is nothing special in having both accoutrement and accouterment exist within the database. The thing that is relevant is that they are both shown to the user of the dictionary who looks up either Expression.
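A minimal sketch of the arrangement described above, using an in-memory SQLite database. The table names Expression, SynTrans and Relation come from the message; the Meaning table, every column name, and the sample definition are assumptions made for illustration and are not the actual UW schema (the Word table is left out for brevity).

```python
import sqlite3

# Hypothetical columns throughout; only the table names Expression,
# SynTrans and Relation are taken from the discussion.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Meaning    (id INTEGER PRIMARY KEY, definition TEXT);
CREATE TABLE Expression (id INTEGER PRIMARY KEY, spelling TEXT, language TEXT);
CREATE TABLE SynTrans   (expression_id INTEGER, meaning_id INTEGER);
CREATE TABLE Relation   (from_id INTEGER, to_id INTEGER, kind TEXT);
""")
conn.execute("INSERT INTO Meaning VALUES (1, 'equipment or accessories')")
conn.executemany(
    "INSERT INTO Expression VALUES (?, ?, ?)",
    [(1, "accoutrement", "English"), (2, "accouterment", "English")],
)
# Both spellings point at the same meaning, which is entered only once.
conn.executemany("INSERT INTO SynTrans VALUES (?, ?)", [(1, 1), (2, 1)])
conn.execute("INSERT INTO Relation VALUES (1, 2, 'alternative spelling')")


def look_up(spelling):
    """Return (definition, spelling) rows for every Expression that shares
    a meaning with the spelling that was looked up."""
    return conn.execute(
        """
        SELECT m.definition, other.spelling
        FROM Expression e
        JOIN SynTrans s      ON s.expression_id = e.id
        JOIN Meaning m       ON m.id = s.meaning_id
        JOIN SynTrans s2     ON s2.meaning_id = m.id
        JOIN Expression other ON other.id = s2.expression_id
        WHERE e.spelling = ?
        ORDER BY other.spelling
        """,
        (spelling,),
    ).fetchall()


def alternative_spellings(spelling):
    """Follow Relation rows of kind 'alternative spelling' in either direction."""
    rows = conn.execute(
        """
        SELECT b.spelling FROM Relation r
        JOIN Expression a ON a.id = r.from_id
        JOIN Expression b ON b.id = r.to_id
        WHERE a.spelling = ? AND r.kind = 'alternative spelling'
        UNION
        SELECT a.spelling FROM Relation r
        JOIN Expression a ON a.id = r.from_id
        JOIN Expression b ON b.id = r.to_id
        WHERE b.spelling = ? AND r.kind = 'alternative spelling'
        """,
        (spelling, spelling),
    ).fetchall()
    return [row[0] for row in rows]
```

Under these assumptions, looking up either spelling returns the same rows, so the user sees both forms whichever Expression was entered.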
Absolutely.
When they are alternate spellings within the same Language, they will be seen as such. So as far as I am concerned, this seems to me to be much ado about nothing.
Provided: (a) the essential information (senses, POS, etymology, etc.) only has to be entered once, and remains the same for all the spelling and orthographical variants; (b) the user, on entering either form, gets the one collection of information which shows all the alternative forms of the word, then I really have no objection. I can't understand why they are in different database records, and in the case of my own JMdict (XML) they aren't, but then I don't use SynTrans, etc.
Frankly I favour the first option, because to non-American-English speakers, the American spellings are simply misspellings.
Well that's news to this non-American-English speaker 8-)} I don't regard them as "misspellings" at all. Just different.
I would regard all accepted words as Expressions. In order to know more about an expression, you have to add more information to enrich the experience. It will, for instance, be possible to date the first accepted use of the later spelling.
That's useful (and very often difficult to establish).
The etymology is also different.
Not really. I don't know about the languages I don't speak (i.e. everything apart from English, Japanese, French and a little Latin), but in general the spelling has little or nothing to do with the etymology.
Then again, this may not be of interest to you but it is there for those who find it of interest.
As a lexicographer I am always interested in etymology. I am a bit mystified by the view that it is somehow coupled to spelling. In the languages I know, spelling used to be highly fluid and individualistic, and has only recently been pinned down into recognized norms. In the case of English, the fact that there are two "schools" of spelling (which only affect a minority of words) is largely the result of the simplifications made and promulgated by one man: Noah Webster. Interesting indeed, but nothing to do with etymology.
Cheers
Jim
On 9/22/05, Jim Breen Jim.Breen@infotech.monash.edu.au wrote:
[Gerard Meijssen (Re: [Wiktionary-l] English orthographies) writes:]
Jim Breen wrote:
Jack & Naree wrote:
An American-English dictionary, and a (Commonwealth) English Dictionary then. Otherwise, it has to be all listed as separate entries.
Um. What about words that are spelled differently within and between Commonwealth countries? Or words like accoutrement/accouterment which are spelled differently within the US?
Of course they shouldn't be separate entries, but Gerard's database design seems to be dictating that one.
The fact that words are spelled differently needs to be addressed in one way or the other.
Absolutely.
Even in old style wiktionary there needs to be something both at the accoutrement and the accouterment article in order to make them "findable".
Agreed.
The English Wiktionary nowadays frowns on the use of redirects so it is more substantial than that.
Yes, in fact it is the frowning on redirects that led me to looking at the UW proposals. I was looking at the Wiktionary structure to see if it would be a suitable environment for my Japanese-Multilingual dictionary database. I ran into a number of problems, one of which was the "no redirects" policy, and someone suggested I look at UW.
The "frowning on redirects" policy is largely due to the fact that we have many languages in one "namespace". When a particular English spelling variant, or even a plural, happens to coincide with the spelling of a word in another language, we have to have two pages anyway. This is not uncommon. We then decided it was better to try for some consistency rather than having some shared pages and some redirects. The other major issue was how to serve both the British (colour, centre) market and the American (color, center) market without our imposing on anyone which is the "standard" and which is the "variant", which is what redirects lead to.
When these distinctions are at the database level, they can be presented to the user by the software in any number of ways.
In itself there is no value added to the fact that it has its own record in the tables Expression and Word. The words can be connected through SynTrans to the same meaning. They can be related through Relation to say that they are alternative ways of spelling.
Consequently, there is nothing special in having both accoutrement and accouterment exist within the database. The thing that is relevant is that they are both shown to the user of the dictionary who looks up either Expression.
Absolutely.
When they are alternate spellings within the same Language, they will be seen as such. So as far as I am concerned, this seems to me to be much ado about nothing.
Provided: (a) the essential information (senses, POS, etymology, etc.) only has to be entered once, and remains the same for all the spelling and orthographical variants;
Sometimes some of these will be different. In British English, and in the Commonwealth apart from Canada, "tire" only means "become tired". In US and Canadian English it also means "tyre", the rubber ring on the outside of a wheel. But these are homonyms rather than senses, though many people who are not lexicography-savvy don't realise the difference.
(b) the user, on entering either form, gets the one collection of information which shows all the alternative forms of the word, then I really have no objection. I can't understand why they are in different database records, and in the case of my own JMdict (XML) they aren't, but then I don't use SynTrans, etc.
Basically it's an arbitrary database design issue. UW is going for more granularity. In this way it's probably more object-oriented since it breaks things down into more, smaller objects. There is nothing intrinsically right or wrong about either approach.
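To make the design trade-off concrete, here is a rough sketch of the two approaches side by side. Neither half is the real JMdict or UW schema: the class name CoarseEntry, the dictionaries, the ID numbers and the sample gloss are all hypothetical, chosen only to show that both layouts can answer the same lookup.

```python
from dataclasses import dataclass


# Coarse-grained: one record carries every spelling plus the shared information.
@dataclass
class CoarseEntry:
    spellings: list
    pos: str
    gloss: str


coarse_entries = [
    CoarseEntry(["accoutrement", "accouterment"], "noun", "equipment or accessories"),
]


def coarse_lookup(spelling):
    """Find the single record that lists this spelling."""
    for entry in coarse_entries:
        if spelling in entry.spellings:
            return entry.gloss, sorted(entry.spellings)
    return None


# Fine-grained: each spelling is its own small object, joined to a shared
# meaning through link rows (playing the role SynTrans has in the discussion).
expressions = {1: "accoutrement", 2: "accouterment"}
meanings = {10: ("noun", "equipment or accessories")}
syntrans = [(1, 10), (2, 10)]  # (expression id, meaning id)


def granular_lookup(spelling):
    """Resolve the spelling to its ID, then walk the links to the meaning."""
    ids = {i for i, s in expressions.items() if s == spelling}
    for expression_id, meaning_id in syntrans:
        if expression_id in ids:
            gloss = meanings[meaning_id][1]
            same = sorted(expressions[e] for e, m in syntrans if m == meaning_id)
            return gloss, same
    return None
```

Both lookups return the same gloss and the same list of spellings; the difference lies in where the connection is stored, not in what the user can be shown.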
Frankly I favour the first option, because to non-American-English speakers, the American spellings are simply misspellings.
Well that's news to this non-American-English speaker 8-)} I don't regard them as "misspellings" at all. Just different.
I would regard all accepted words as Expressions. In order to know more about an expression, you have to add more information to enrich the experience. It will, for instance, be possible to date the first accepted use of the later spelling.
That's useful (and very often difficult to establish).
The etymology is also different.
Not really. I don't know about the languages I don't speak (i.e. everything apart from English, Japanese, French and a little Latin), but in general the spelling has little or nothing to do with the etymology.
Sometimes one spelling is definitely known to be derived from another and both remain in use in various places. For instance the Spanish word for "peanut" was borrowed from Nahuatl in Mexico as "cacahuate" but when it was later borrowed into Spain itself it became "cacahuete". It would be a shame to not have a way to record such things in the cases we do know them.
Then again, this may not be of interest to you but it is there for those who find it of interest.
As a lexicographer I am always interested in etymology. I am a bit mystified by the view that it is somehow coupled to spelling. In the languages I know, spelling used to be highly fluid and individualistic, and has only recently been pinned down into recognized norms. In the case of English, the fact that there are two "schools" of spelling (which only affect a minority of words) is largely the result of the simplifications made and promulgated by one man: Noah Webster. Interesting indeed, but nothing to do with etymology.
Spelling may not be exactly part of the definition of "etymology", but it is a part of the broader subject of "word history". The former term is more popular and is often used to cover both, rightly or wrongly.
As regards Webster's simplifications, in fact I've read it's not so simple and that some spellings were concocted by Samuel Johnson. Some of those were continued by Webster and some he "reverted".
Andrew Dunbar (from Australia if it makes a difference)
Cheers
Jim
--
Jim Breen http://www.csse.monash.edu.au/~jwb/
Clayton School of Information Technology, Tel: +61 3 9905 9554
Monash University, VIC 3800, Australia Fax: +61 3 9905 5146
(Monash Provider No. 00008C)
_______________________________________________
Wiktionary-l mailing list
Wiktionary-l@Wikipedia.org
http://mail.wikipedia.org/mailman/listinfo/wiktionary-l
Andrew Dunbar hippytrail@gmail.com wrote:
The etymology is also different.
Not really. I don't know about the languages I don't speak (i.e. everything apart from English, Japanese, French and a little Latin), but in general the spelling has little or nothing to do with the etymology.
Sometimes one spelling is definitely known to be derived from another and both remain in use in various places. For instance the Spanish word for "peanut" was borrowed from Nahuatl in Mexico as "cacahuate" but when it was later borrowed into Spain itself it became "cacahuete". It would be a shame to not have a way to record such things in the cases we do know them.
Indeed, but hopefully the area of the entry where spellings are given will be able to contain notes describing the who/what/where/why/when of the spelling-- tho thus far all I've heard about is the who, i.e. dialect and so-called authorities-- and it won't have to be lumped in with the etymology, whose job is to explain the etymon or etyma of a word and shouldn't have to touch on spelling (unless perhaps to explain why a certain spelling came to be, but even that could be handled by an annotation to the spelling itself).
*Muke!
On 9/22/05, Muke Tever muke@frath.net wrote:
Andrew Dunbar hippytrail@gmail.com wrote:
The etymology is also different.
Not really. I don't know about the languages I don't speak (i.e. everything apart from English, Japanese, French and a little Latin), but in general the spelling has little or nothing to do with the etymology.
Sometimes one spelling is definitely known to be derived from another and both remain in use in various places. For instance the Spanish word for "peanut" was borrowed from Nahuatl in Mexico as "cacahuate" but when it was later borrowed into Spain itself it became "cacahuete". It would be a shame to not have a way to record such things in the cases we do know them.
Indeed, but hopefully the area of the entry where spellings are given will be able to contain notes describing the who/what/where/why/when of the spelling--tho thus far all I've heard about is the who, i.e. dialect and so-called authorities--and it won't have to be lumped in with the etymology, whose job is to explain the etymon or etyma of a word and shouldn't have to touch on spelling (unless perhaps to explain why a certain spelling came to be, but even that could be handled by an annotation to the spelling itself).
I think either place for notation of spelling variants is valid but the proposed UW way makes the connection explicitly in the data rather than in the text of the notation. This means easier analysis by computer - which will be a good thing with a very large and structured dictionary.
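As a rough illustration of why an explicit connection in the data is easier to analyse than a free-text note, the sketch below stores relations as plain tuples and groups spellings mechanically. The tuple layout, the relation labels and the function name group_variants are assumptions for this example only; the word pairs are the ones mentioned earlier in the thread.

```python
from collections import defaultdict

# Hypothetical relation records: (expression, related expression, kind).
relations = [
    ("colour", "color", "alternative spelling"),
    ("centre", "center", "alternative spelling"),
    ("cacahuete", "cacahuate", "derived from"),
]


def group_variants(relations, kind="alternative spelling"):
    """Group expressions that are linked, directly or indirectly,
    by relations of the given kind."""
    neighbours = defaultdict(set)
    for a, b, relation_kind in relations:
        if relation_kind == kind:
            neighbours[a].add(b)
            neighbours[b].add(a)

    seen = set()
    groups = []
    for start in sorted(neighbours):
        if start in seen:
            continue
        # Depth-first walk over everything reachable from this expression.
        stack, group = [start], set()
        while stack:
            word = stack.pop()
            if word in group:
                continue
            group.add(word)
            stack.extend(neighbours[word] - group)
        seen |= group
        groups.append(sorted(group))
    return groups
```

With the sample data, `group_variants(relations)` yields one group per spelling pair, and passing `"derived from"` instead picks out the cacahuate/cacahuete pair. No parsing of notes is needed, which is the point being made about a large structured dictionary.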
I don't agree with the "authority" concept. Or maybe it's just the chosen name for the concept I find unsettling. I would have chosen "orthography" before reading Jim's comments on orthography vs. spelling. For instance I would've thought in the case of German that "the pre-1998 German orthography" would be a valid concept. If I substituted the word "spelling" in this phrase it sounds like it refers to a specific word rather than the whole language. Maybe "spelling standard" works better for Jim?
Another thing to think about is that changes in spelling happen for various reasons. The -our in English was inspired by the French of the time. But many others reflect things such as pronunciation changes, re-analysis of how the word was formed, or differing pronunciations in different communities. So while saying "spelling isn't directly related to etymology" is true, I think it's quite a bit less than the whole truth also.
Andrew Dunbar (hippietrail)
*Muke!
website: http://frath.net/ LiveJournal: http://kohath.livejournal.com/ deviantArt: http://kohath.deviantart.com/
FrathWiki, a conlang and conculture wiki: http://wiki.frath.net/
Muke Tever wrote:
Andrew Dunbar hippytrail@gmail.com wrote:
The etymology is also different.
Not really. I don't know about the languages I don't speak (i.e. everything apart from English, Japanese, French and a little Latin), but in general the spelling has little or nothing to do with the etymology.
Sometimes one spelling is definitely known to be derived from another and both remain in use in various places. For instance the Spanish word for "peanut" was borrowed from Nahuatl in Mexico as "cacahuate" but when it was later borrowed into Spain itself it became "cacahuete". It would be a shame to not have a way to record such things in the cases we do know them.
Indeed, but hopefully the area of the entry where spellings are given will be able to contain notes describing the who/what/where/why/when of the spelling-- tho thus far all I've heard about is the who, i.e. dialect and so-called authorities-- and it won't have to be lumped in with the etymology, whose job is to explain the etymon or etyma of a word and shouldn't have to touch on spelling (unless perhaps to explain why a certain spelling came to be, but even that could be handled by an annotation to the spelling itself).
There are times when the etymology can be a guide to the spelling, especially when questions of double letters are involved. Thus "toroid" and not "torroid" or "millennium" rather than "millenium".
Ec
Ray Saintonge saintonge@telus.net wrote:
and it won't have to be lumped in with the etymology, whose job is to explain the etymon or etyma of a word and shouldn't have to touch on spelling (unless perhaps to explain why a certain spelling came to be, but even that could be handled by an annotation to the spelling itself).
There are times when the etymology can be a guide to the spelling, especially when questions of double letters are involved. Thus "toroid" and not "torroid" or "millennium" rather than "millenium".
Yes, and sometimes the etymology is ignored in the reckoning of correct spelling, thus "island" and not "iland" and "thumb" rather than "thum". Etymology is never a proof of spelling, though it is one of the factors that influences it.[1] That's why I wrote that the etymology might _explain_ a spelling--I originally wrote that etymology could be used to suggest the correctness of one valid, accepted, or otherwise usual spelling over another, but removed that when I realized that kind of statement, while common in other dictionaries, would be POV-pushing and have no place in any Wiktionary: if a language already has a standard, we note that (and do not attempt to change it), and if there is no standard it is not our place to instate one.
*Muke!
[1] Others being: pronunciation, tradition, analogy (whence 'thumb'), folk etymology (whence 'island'), aligning with or varying from the practices of other nations, orthographic practices, etc.--all of which IMO are reasons for the spellings' annotations to go with the spellings instead of all over the page.
On 9/22/05, Jim Breen Jim.Breen@infotech.monash.edu.au wrote:
As a lexicographer I am always interested in etymology. I am a bit mystified by the view that it is somehow coupled to spelling. In the languages I know, spelling used to be highly fluid and individualistic, and has only recently been pinned down into recognized norms. In the case of English, the fact that there are two "schools" of spelling (which only affect a minority of words) is largely the result of the simplifications made and promulgated by one man: Noah Webster. Interesting indeed, but nothing to do with etymology.
In the UW, an etymology is not related directly to a spelling. Typically an etymology is linked to a "lemma". Lemmas are typically considered the combination of a meaning and a word. In the UW a Lemma can be considered a specific occurrence in the table SynTrans. This way you have the combination of a Word and a Meaning.
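A small sketch of this idea, assuming a Lemma is simply one (Word, Meaning) pairing and that an etymology hangs off that pairing rather than off the spelling. The class fields, the dictionary and the placeholder strings are hypothetical; the "tire"/"tyre" example is borrowed from earlier in the thread, and no real etymologies are given.

```python
from dataclasses import dataclass


# Hypothetical stand-in for one row of SynTrans: a Word paired with a Meaning.
@dataclass(frozen=True)
class SynTrans:
    word: str
    meaning: str


tire_verb = SynTrans("tire", "become tired")
tire_noun = SynTrans("tire", "rubber ring on the outside of a wheel")
tyre_noun = SynTrans("tyre", "rubber ring on the outside of a wheel")

# Etymology is keyed by the lemma (the SynTrans row), not by the spelling.
# The texts are placeholders, not actual etymologies.
etymology = {
    tire_verb: "(placeholder: history of the verb)",
    tire_noun: "(placeholder: history of the noun)",
    tyre_noun: "(placeholder: history of the noun)",
}


def lemmas_for(word):
    """Return every lemma (Word + Meaning pairing) that uses this spelling."""
    return [lemma for lemma in etymology if lemma.word == word]
```

Here the single spelling "tire" yields two lemmas with separate etymology slots, while "tire" and "tyre" in the wheel sense can carry the same note despite being spelled differently.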
You wrote that you are interested in UW because of a multi-language database that you have, particularly for/with Japanese. I do not know what you want to achieve, but if your interest is in analysing the possibility of importing your data into UW, then I would love to have a look at your data design. If you would consider importing the content under the GFDL, I would be even happier.
One thing I will learn from this is whether I have all the features needed to include a Japanese dictionary.
Thanks, GerardM
wiktionary-l@lists.wikimedia.org