[Muke Tever (Re: [Wiktionary-l] Re: English orthographies) writes:]
>>
>> Yes, and sometimes the etymology is ignored in the reckoning of correct spelling,
>> thus "island" and not "iland" and "thumb" rather than "thum". Etymology is never a
>> proof of spelling, though it is one of the factors that influences it.
In some cases, e.g. modern Japanese, the spelling, i.e. the kana
representation, has been aligned entirely with pronunciation, thus
blowing away many etymological influences. This was fought over for 50
years, but finally the radicals triumphed over the conservatives. The
old kana system was even more dysfunctional than English spelling.
Jim
--
Jim Breen http://www.csse.monash.edu.au/~jwb/
Clayton School of Information Technology, Tel: +61 3 9905 9554
Monash University, VIC 3800, Australia Fax: +61 3 9905 5146
(Monash Provider No. 00008C) ジム・ブリーン@モナシュ大学
[Andrew Dunbar ([Wiktionary-l] Re: English orthographies) writes:]
>>
>> I don't agree with the "authority" concept. Or maybe it's just the chosen name
>> for the concept I find unsettling.
Maybe the latter. While it can happily mean "the verifiable source that
the xxxx spelling exists", it can also be taken to mean some sanctioning
body.
>> I would have chosen "orthography" before
>> reading Jim's comments on orthography vs. spelling. For instance I would've
>> thought in the case of German that "the pre-1998 German orthography" would
>> be a valid concept. If I substituted the word "spelling" in this
>> phrase it sounds
>> like it refers to a specific word rather than the whole language.
>> Maybe "spelling
>> standard" works better for Jim?
It does. When I think of German orthography the image of pre-war Gothic
scripts vs modern Carolingian uncials (or whatever they are) comes to
mind, but I agree that things like o_umlaut/oe are probably more
orthography than spelling because they are operating across the board at
a language level rather than at a word variance level.
Cheers
Jim
[GerardM (Re: [Wiktionary-l] English orthographies) writes:]
>> In the UW, an etymology is not related directly to a spelling. Typically an
>> etymology is linked to a "lemma". Lemmas are typically considered the
>> combination of a meaning and a word.
Well, a set of words, in that in English "sing", "sang" and "sung" are
the one lemma.
>> You wrote that you are interested in UW because of a multi language database
>> that you have particularly for/with Japanese. I do not know what you want to
>> achieve, but if your interest is in an analysis of the possibility to import
>> your data in UW, then I would love to have a look at your data design. If
>> you consider importing the content under the GFDL, I would be even more
>> happy.
A good place to start is
http://www.csse.monash.edu.au/~jwb/j_jmdict.html There is an HTMLized
sample entry at:
http://www.csse.monash.edu.au/~jwb/jmdict_sample.html
>> One lesson that I will learn, is if I have all the features to include a
>> Japanese dictionary.
For a number of reasons, Japanese is a good language to test a
dictionary design against because it is a very different animal to
a typical Indo-European language. Then you have to consider languages
like Chinese, or Semitic languages such as Arabic and Hebrew, as they
bring in other challenges.
Cheers
Jim
[Andrew Dunbar ([Wiktionary-l] Re: English orthographies) writes:]
>> On 9/22/05, Jim Breen <Jim.Breen(a)infotech.monash.edu.au> wrote:
>> > Yes, in fact it is the frowning on redirects that led me to looking at
>> > the UW proposals. I was looking at the Wiktionary structure to see if it
>> > would be a suitable environment for my Japanese-Multilingual dictionary
>> > database. I ran into a number of problems, one of which was the "no
>> > redirects" policy, and someone suggested I look at UW.
>>
>> The "frowning on redirects" policy is largely due to the fact that we have many
>> languages in one "namespace". When a particular English spelling variant or
>> even a plural happens to coincide with the spelling of another word in another
>> language then we have to have two pages anyway. This is not uncommon.
>> We then decided it was better to try for some consistency rather than having
>> some shared pages and some redirects. The other major issue was what to
>> do when a dictionary is created for both the British (colour, centre) market
>> and the American (color, center) market without us trying to force upon anyone
>> which is the "standard" and which is the "variant", which redirects lead to.
This second problem goes away if a search for an entry can be made on more
than one "headword". In fact the single headword is a limitation of paper
dictionaries that never needed to be propagated into electronic
dictionaries.
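As a minimal sketch of the multiple-headword idea: each entry lists all
of its headwords, and an index maps every one of them to the same entry
object. The Entry class, its field names and the sample data are
illustrative assumptions, not part of any Wiktionary or UW design.

```python
from dataclasses import dataclass, field


@dataclass
class Entry:
    """One dictionary entry; POS and senses are stored once for all headwords."""
    headwords: list
    pos: str
    senses: list = field(default_factory=list)


def build_index(entries):
    """Map every headword to the list of entries that carry it."""
    index = {}
    for entry in entries:
        for form in entry.headwords:
            index.setdefault(form, []).append(entry)
    return index


# Hypothetical sample data based on the examples in this thread.
entries = [
    Entry(["colour", "color"], "noun", ["hue"]),
    Entry(["tyre", "tire"], "noun", ["the rubber ring on the outside of a wheel"]),
    Entry(["tire"], "verb", ["become tired"]),
]
index = build_index(entries)

# Either spelling reaches the very same entry; neither is "the variant".
same_entry = index["colour"][0] is index["color"][0]

# A shared spelling simply returns several entries (homonyms stay separate).
tire_entries = [e.pos for e in index["tire"]]
```

With this arrangement no spelling has to be nominated as the standard
that the others redirect to, and homonyms such as the two "tire" entries
remain distinct entries that happen to share a lookup key.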
>> > Provided:
>> > (a) the essential information (senses, POS, etymology, etc.) only has to
>> > be entered once, and remains the same for all the spelling and
>> > orthographical variants;
>>
>> Sometimes some of these will be different. In British and the Commonwealth
>> except Canada "tire" only means "become tired". In US and Canadian English
>> it also means "tyre", the rubber ring on the outside of a wheel. But these are
>> homonyms rather than senses though many non-lexicography-savvy people
>> don't realise the difference.
Of course they are homonyms. With a relatively small set of phonemes,
Japanese is riddled with homonyms; there are cases of more than 20 different
words with the same pronunciation. You'd go (and be) crazy if you tried
to treat them as the one "word".
>> > (b) the user, on entering either form, gets the one collection of
>> > information which shows all the alternative forms of the word, then
>> > I really have no objection. I can't understand why they are in different
>> > database records, and in the case of my own JMdict (XML) they aren't,
>> > but then I don't use SynTrans, etc.
>>
>> Basically it's an arbitrary database design issue. UW is going for more
>> granularity. In this way it's probably more object-oriented since it breaks
>> things down into more, smaller objects. There is nothing intrinsically right or
>> wrong about either approach.
Provided the design doesn't intrude into the operation (creation,
maintenance, lookup, etc.).
>> > Not really. I don't know about the languages I don't speak (i.e.
>> > everything apart from English, Japanese, French and a little Latin), but
>> > in general the spelling has little or nothing to do with the etymology.
>>
>> Sometimes one spelling is definitely known to be derived from another
>> and both remain in use in various places. For instance the Spanish word
>> for "peanut" was borrowed from Nahuatl in Mexico as "cacahuate" but
>> when it was later borrowed into Spain itself it became "cacahuete". It would
>> be a shame to not have a way to record such things in the cases we do
>> know them.
I was really referring to the centre/center, colour/color situations. I
should have said "minor spelling differences".
Cheers
Jim
[Gerard Meijssen (Re: [Wiktionary-l] English orthographies) writes:]
>> Jim Breen wrote:
>> >>>Jack & Naree wrote:
>> >>>>An American-English dictionary, and a (Commonwealth) English
>> >>>>Dictionary then.
>> >>>>Otherwise, it has to be all listed as separate entries.
>> >
>> >Um. What about words that are spelled differently within and
>> >between Commonwealth countries? Or words like accoutrement/accouterment
>> >which are spelled differently within the US?
>> >
>> >Of course they shouldn't be separate entries, but Gerard's
>> >database design seems to be dictating that one.
>> >
>> The fact that words are spelled differently needs to be addressed in one
>> way or the other.
Absolutely.
>> Even in old style wiktionary there needs to be
>> something both at the accoutrement and the accouterment article in order
>> to make them "findable".
Agreed.
>> The English Wiktionary nowadays frowns on the
>> use of redirects so it is more substantial than that.
Yes, in fact it is the frowning on redirects that led me to looking at
the UW proposals. I was looking at the Wiktionary structure to see if it
would be a suitable environment for my Japanese-Multilingual dictionary
database. I ran into a number of problems, one of which was the "no
redirects" policy, and someone suggested I look at UW.
>> In itself there is no value added to the fact that it has its own record
>> in the tables Expression and Word. The words can be connected through
>> SynTrans to the same meaning. They can be related through Relation to
>> say that they are alternative ways of spelling.
>>
>> Consequently, there is nothing special in having both accoutrement and
>> accouterment exist within the database. The thing that is relevant is
>> that they are both shown to the user of the dictionary who looks up
>> either Expression.
Absolutely.
>> When they are alternate spellings within the same
>> Language, they will be seen as such. So as far as I am concerned, this
>> seems to me to be much ado about nothing.
Provided:
(a) the essential information (senses, POS, etymology, etc.) only has to
be entered once, and remains the same for all the spelling and
orthographical variants;
(b) the user, on entering either form, gets the one collection of
information which shows all the alternative forms of the word, then
I really have no objection. I can't understand why they are in different
database records, and in the case of my own JMdict (XML) they aren't,
but then I don't use SynTrans, etc.
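Conditions (a) and (b) can be sketched against a granular layout as
well. In the following minimal sketch each spelling is its own
expression record, and a SynTrans-style link table ties both to one
shared meaning record. Only the names Expression and SynTrans come from
the discussion above; the table contents, field names, the lookup()
helper and the placeholder definition are assumptions and not the
actual UW schema.

```python
# Hypothetical tables: one record per spelling, one record per meaning.
expressions = {
    1: ("en", "accoutrement"),
    2: ("en", "accouterment"),
}
meanings = {
    100: {"pos": "noun", "definition": "an item of equipment or dress"},
}
# SynTrans-style links: (expression_id, meaning_id).
syntrans = [(1, 100), (2, 100)]


def lookup(spelling, language):
    """Return each meaning of `spelling` with all same-language forms sharing it."""
    wanted = [eid for eid, (lang, text) in expressions.items()
              if lang == language and text == spelling]
    results = []
    for eid in wanted:
        for exp_id, meaning_id in syntrans:
            if exp_id != eid:
                continue
            forms = sorted(
                expressions[e][1]
                for e, m in syntrans
                if m == meaning_id and expressions[e][0] == language
            )
            results.append({"meaning": meanings[meaning_id], "forms": forms})
    return results


# (a) the meaning record exists once; (b) either spelling shows both forms.
by_re = lookup("accoutrement", "en")
by_er = lookup("accouterment", "en")
```

Under these assumptions the two lookups return identical results, so
the separate expression records cost nothing as long as senses, POS and
etymology hang off the shared meaning record and not off each spelling.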
>> >>>>Frankly I favour the first option, because to non-American-English
>> >>>>speakers, the American spellings are simply misspellings.
>> >
>> >Well that's news to this non-American-English speaker 8-)} I
>> >don't regard them as "misspellings" at all. Just different.
>> I would regard all accepted words as Expressions. In order to know more
>> about an expression, you have to add more information to enrich the
>> experience. It will, for instance, be possible to date the first
>> accepted use of the later spelling.
That's useful (and very often difficult to establish).
>> The etymology is also different.
Not really. I don't know about the languages I don't speak (i.e.
everything apart from English, Japanese, French and a little Latin), but
in general the spelling has little or nothing to do with the etymology.
>> Then again, this may not be of interest to you but it is there for those
>> who find it of interest.
As a lexicographer I am always interested in etymology. I am a bit
mystified by the view that it is somehow coupled to spelling. In the
languages I know, spelling used to be highly fluid and individualistic, and
has only recently been pinned down into recognized norms. In the case of
English, the fact that there are two "schools" of spelling (which only
affect a minority of words) is largely the result of the simplifications
made and promulgated by one man: Noah Webster. Interesting indeed, but
nothing to do with etymology.
Cheers
Jim
[I missed this earlier posting]
>> Jack & Naree wrote:
>> > An American-English dictionary, and a (Commonwealth) English
>> > Dictionary then.
>> > Otherwise, it has to be all listed as separate entries.
Um. What about words that are spelled differently within and
between Commonwealth countries? Or words like accoutrement/accouterment
which are spelled differently within the US?
Of course they shouldn't be separate entries, but Gerard's
database design seems to be dictating that one.
>> > Frankly I favour the first option, because to non-American-English
>> > speakers, the American spellings are simply misspellings.
Well that's news to this non-American-English speaker 8-)} I
don't regard them as "misspellings" at all. Just different.
Cheers
Jim
Congratulations on reaching the 300th language :-)
Well some of you will say: but for some languages there are only some
words ... yes, but it is hard to find these "some words" in so many
languages :-)
It is a wonderful starting point to work on minor and rare languages.
So if you have terminology from minor and rare languages: don't wait - add
it to Wiktionary :-)
Ciao, Sabine
Greetings,
[Gerard Meijssen (Re: [Wiktionary-l] English orthographies) writes:]
>> Jim Breen wrote:
>> >[Gerard Meijssen ([Wiktionary-l] English orthographies) writes:]
>> >>>1) English, American English and other orthographies are treated as
>> >>>separate entities.
>> >
>> >I think this will be a disaster.
>> >
>> >Can you explain why "jewellery" and "jewelry" cannot be alternatives
>> >within the one entry?
>> In the database design,
>> http://meta.wikimedia.org/wiki/Ultimate_Wiktionary_data_design , an
>> Expression is a number of characters that make up a valid occurrence in
>> a language. Therefore every spelling IS a different Expression.
OK, so the short answer is that the UW database was designed that way.
I predict it will be a mess. It is also at variance with all the
lexicographical databases I have seen.
>> The English used in Britain, the United States, Australia etc is
>> significantly different.
Nonsense.
>> This can be found in the difference in
>> vocabulary and the difference in orthography.
Both the spelling and vocabulary differ only to a very small extent.
>> Typically when considering
>> spelling, the way the English, American, Australian spell differently
>> makes it a different orthography. This is reflected in there being
>> English, American etc dead wood dictionaries. In a project like UW where
>> we collect all words of all languages, it makes sense to reflect this.
By all means collect them and reflect them, but don't foster the
impression that they are a major issue, because they aren't. Also don't
fall into the trap of thinking that you can neatly compartmentalize
English spellings into strict country groups. Different mixes of spellings
are used right across the English-speaking world.
BTW, spelling and orthography are different things. Orthography refers
to the writing system, i.e. "a method of representing the sounds of a
language by written or printed symbols" (to quote Wordnet.) English
is written with one orthographical system.
To a large extent I don't really care that much about "jewellery" and
"jewelry" being in their own entries in English, because they are only a
few percent of words. Where this approach will be a total disaster is
with Japanese, where most words can be and are written in two or
more scripts, and where spelling variations are rife. The idea that the
meaning, POS, etc. etc. for a word will be replicated again and again
and again for each writing variant is too awful to contemplate.
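To make the replication concern concrete, here is a minimal sketch that
parses a simplified entry loosely modelled on the JMdict XML layout
(kanji forms, kana readings, and senses held once per entry) and
compares it with a one-record-per-written-form layout. The sample entry
is abbreviated and illustrative, and the per-form layout is a
hypothetical worst case, not a description of what UW actually stores.

```python
import copy
import xml.etree.ElementTree as ET

# Simplified, illustrative entry: two kanji forms, one kana form, one sense.
SAMPLE = """
<entry>
  <k_ele><keb>寿司</keb></k_ele>
  <k_ele><keb>鮨</keb></k_ele>
  <r_ele><reb>すし</reb></r_ele>
  <sense>
    <pos>noun</pos>
    <gloss>sushi</gloss>
  </sense>
</entry>
"""

entry = ET.fromstring(SAMPLE.strip())

# Every way of writing the word, kanji forms first, then kana.
forms = [e.text for e in entry.iter("keb")] + [e.text for e in entry.iter("reb")]

# Senses are stored once in the entry.
senses = [
    {
        "pos": [p.text for p in s.findall("pos")],
        "glosses": [g.text for g in s.findall("gloss")],
    }
    for s in entry.findall("sense")
]

# Hypothetical alternative: one record per written form, each with its own copy.
per_form_records = [{"form": f, "senses": copy.deepcopy(senses)} for f in forms]

stored_once = len(senses)
stored_replicated = sum(len(r["senses"]) for r in per_form_records)
```

Here one sense becomes three copies, and every later correction to the
gloss or POS would have to be applied to each copy. The multiplier grows
with the number of written forms, which is the situation described above
for Japanese.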
To be blunt, it sounds like the UW database design was done with one
or a few languages in mind, and the others are being told to fall into
line.
Cheers
Jim
Jack & Naree wrote:
>
> Hoi,
> I would not consider either variation of English to be more or less
> important/relevant. What I consider is practical; how does it impact
> including this content in Ultimate Wiktionary. Here we have a need to
> identify a word as either EE or AE or ?E and the question is how
> to do this.
>
>
> An American-English dictionary, and a (Commonwealth) English
> Dictionary then.
> Otherwise, it has to be all listed as separate entries.
> Frankly I favour the first option, because to non-American-English
> speakers, the American spellings are simply misspellings.
>
> How does Dutch, Flemish, and Afrikaans approach this? Do you have a
> separate Flemish wiktionary etc...?
Dutch and Flemish are considered one language. I would not want a
separate spell checker for either Dutch or Flemish. All Flemish words
are, as far as I am concerned, as good as any Dutch word. Afrikaans is a
separate language and it should truly be seen as such.
>
> It is up to the Wiktionary community how they want to have this.
> They can either have it with descriptions in definitions and
> etymologies spelled in one of the used orthographies or it can be
> considered not to be too important and it can be either.
> Thanks,
> GerardM
>
>