On 7/24/05, Gerard Meijssen gerard.meijssen@gmail.com wrote:
Hoi,
I had an interesting conversation with Brion. We do not agree on everything. One of the things we do not agree on is redirects.
In my opinion, Wiktionary should not have redirects. A word is either spelled correctly, and it will have its lemma, or it is not, and there will not be a lemma with the incorrect spelling. In Brion's opinion there are links to lemmas, and as we need to ensure that these links remain OK, we need redirects to make this possible.
In a Wikipedia context I am 100% with Brion. In a Wiktionary context it is a different matter. As only correctly spelled words should be in a Wiktionary, errors should be deleted. Some of our Wiktionaries, for historical reasons, capitalise their articles.
"Historical reasons" is surely not the only reason a Wiktionary uses first-character capitalisation. Turning off first-character capitalisation is not the only way to achieve correctly spelled article titles, and some have denied that correctly spelled article titles are a reason for turning it off at all.
Don't forget that capitalisation of the first letter is only one issue as regards correct orthography in article titles. Another thing to watch out for is the three variations for making English compounds: two words with a space, two words with hyphenation, and one compound word. For any term, any one, two or three of these variants may be considered correct.
Other test cases which have recently met strong resistance are Latin words spelled in all capitals, which was the only possible spelling while Latin was a living language, and the correct, unambiguous apostrophe character, which has been widely available on home computers for 20 years.
Other languages also have optional or compatibility spellings: In French it is officially correct to indicate accents on capital letters, but there is a de facto rule to leave them out. German specifically allows ä, ö, and ü to be spelled as ae, oe, and ue. In Switzerland there is no letter ß; the correct spelling is ss instead. This is also specifically considered correct in the other German-speaking countries. Latin and Old English often have macrons to show long vowels and rarely have breves to show short vowels.
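As a rough illustration only, the German compatibility spellings and the optional length marks could be modelled like this in Python. The table and the function names are hypothetical and made up for this sketch; nothing here is taken from any Wiktionary software.

```python
import unicodedata

# Hypothetical table of the German compatibility spellings mentioned above.
GERMAN_FALLBACKS = {
    "ä": "ae", "ö": "oe", "ü": "ue",
    "Ä": "Ae", "Ö": "Oe", "Ü": "Ue",
    "ß": "ss",
}

def german_fallback(word: str) -> str:
    """Return the ae/oe/ue/ss spelling of a German word."""
    return "".join(GERMAN_FALLBACKS.get(ch, ch) for ch in word)

def strip_length_marks(word: str) -> str:
    """Drop combining macrons (U+0304) and breves (U+0306), keep everything else."""
    decomposed = unicodedata.normalize("NFD", word)
    kept = [ch for ch in decomposed if ch not in ("\u0304", "\u0306")]
    return unicodedata.normalize("NFC", "".join(kept))
```

With these definitions, german_fallback("Straße") gives "Strasse" and strip_length_marks("rēx") gives "rex". Both inputs and both outputs are spellings that somebody considers correct, which is the point.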
Old English and Middle English also had various fashions but no official spelling, with various exotic letters being used at different times and under various circumstances, resulting in varied spellings of many words. For example, ð and þ were mostly interchangeable.
Ancient Greek and Modern Greek have different accent marks which look quite similar but have different names and different places in Unicode. However, the Modern Greek accents are still much more common in Ancient Greek text on the Internet.
Hebrew geresh is often represented by an ASCII apostrophe, Hebrew gershayim by an ASCII double quote, Hebrew maqaf by an ASCII hyphen, and Hawaiian okina by an ASCII apostrophe. Turkish long vowels (actually more complicated than this) can be indicated with the circumflex accent according to the official orthographical rules. Russian (and some other Cyrillic-script languages) can optionally indicate where the stress falls, and in some contexts this is the norm. With Hebrew and most languages in Arabic script, all short vowels are optional, as are a number of other "letters" such as dagesh, shadda, sukun, and a host of more exotic ones.
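A small Python sketch of the Greek case, using alpha as the example letter: the polytonic "oxia" and the monotonic "tonos" forms are distinct code points, but Unicode defines them as canonically equivalent, so normalising to NFC folds one into the other. The helper name same_title is hypothetical, and whether any given wiki normalises titles this way is an assumption.

```python
import unicodedata

OXIA_ALPHA = "\u1f71"   # GREEK SMALL LETTER ALPHA WITH OXIA (polytonic block)
TONOS_ALPHA = "\u03ac"  # GREEK SMALL LETTER ALPHA WITH TONOS (monotonic block)

def same_title(a: str, b: str) -> bool:
    """Compare two titles after NFC normalisation (hypothetical helper)."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)

# Different code points with different names...
print(unicodedata.name(OXIA_ALPHA), "/", unicodedata.name(TONOS_ALPHA))
# ...but equal once both are normalised.
print(same_title(OXIA_ALPHA, TONOS_ALPHA))
```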
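To show what the ASCII stand-ins look like in practice, here is a minimal Python sketch that maps the dedicated punctuation characters to the ASCII characters commonly typed in their place. The table name and the function are hypothetical, and the sketch takes no position on which spelling belongs in a title.

```python
# Hypothetical table: dedicated character -> ASCII character often typed instead.
ASCII_SUBSTITUTES = {
    "\u05f3": "'",   # HEBREW PUNCTUATION GERESH
    "\u05f4": '"',   # HEBREW PUNCTUATION GERSHAYIM
    "\u05be": "-",   # HEBREW PUNCTUATION MAQAF
    "\u02bb": "'",   # MODIFIER LETTER TURNED COMMA, used for the Hawaiian okina
}
_TABLE = str.maketrans(ASCII_SUBSTITUTES)

def ascii_punctuation_variant(title: str) -> str:
    """Return the spelling with ASCII stand-ins for the characters above."""
    return title.translate(_TABLE)
```

For example, ascii_punctuation_variant("Hawai\u02bbi") returns "Hawai'i". Note that the mapping cannot be reversed automatically, because the ASCII apostrophe stands in for both geresh and okina.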
Hebrew also has accents which only occur in religious works plus there are plene and defective spellings and both have vowels etc as optional extras on top.
In some Polynesian languages, macrons and glottal stops are optional; in others they are compulsory.
Chinese, Japanese, and Korean have written variants of many characters which have the same meaning and sound with all being correct. They also have variants which exist only due to computer encodings and quirks in how various fonts were designed.
For some languages, different optional features of orthography can interact to form many combinations and permutations, all of which are correct spellings.
There are surely quite a few more examples I haven't even become aware of yet.
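A minimal Python sketch of how quickly this multiplies: given a word and a list of independent optional respellings, enumerate every combination of applying or not applying each one. The function names and the two example transforms are hypothetical. Whether every combination is really acceptable is a linguistic question; the sketch only enumerates them.

```python
from itertools import product

def all_spellings(word, optional_transforms):
    """Return the set of spellings from applying any subset of the transforms."""
    spellings = set()
    # One True/False choice per transform: 2**n combinations in total.
    for choices in product((False, True), repeat=len(optional_transforms)):
        variant = word
        for apply_it, transform in zip(choices, optional_transforms):
            if apply_it:
                variant = transform(variant)
        spellings.add(variant)
    return spellings

def eszett_to_ss(word):
    """Optional respelling: ß written as ss."""
    return word.replace("ß", "ss")

def u_umlaut_to_ue(word):
    """Optional respelling: ü written as ue."""
    return word.replace("ü", "ue")
```

Calling all_spellings("Füße", [eszett_to_ss, u_umlaut_to_ue]) yields four spellings from just two optional features: Füße, Füsse, Fueße, and Fuesse.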
In essence this means that from a spelling point of view the names of the lemmas are irrelevant. However, many people assume that the name of the article indicates that a word is spelled correctly. To remedy this, more and more Wiktionaries are moving away from first-character capitalisation, making it possible to have correctly spelled words as a lemma.
Or they are moving due to rhetoric like this email rather than for any good reason. Remember that in print dictionaries the norm is to include different meanings and parts of speech, and even derivatives - regardless of capitalisation - into one article or at least on the one page.
English Wiktionary still considers only first-letter capitalisation, ASCII apostrophes, and Russian without stress marks to be correct enough to be titles, in the last case even as redirects! (If that is the sense of "lemma" you mean.) What do other Wiktionaries do?
When a wiktionary has made this move away from first character capitalisation, the interwiki and interproject links within the Wikimedia projects need to be fixed. After this, the redirects can in my opinion be removed. I think this is appropriate because users expect that an application behaves in certain ways. When new content is added to a non-capitalised Wiktionary, the word foo will not have a redirect in Foo and consequently it behaves differently from the content predating the move to non-capitalisation. Also words like Kinder and kinder are not related at all.
Don't you mean that "not all words like Kinder and kinder are related"? This is almost the opposite meaning. Also many words are related. Even in German it is common for a noun and another part of speech to be intimately related and share an identical spelling apart from capitalisation.
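The Kinder/kinder case can be sketched in a few lines of Python. This is a hypothetical model of title handling, written only for this illustration; title_key and its first_letter_caps flag are made-up names, not the wiki software's actual code.

```python
def title_key(title: str, first_letter_caps: bool) -> str:
    """Return the page key stored for a title under the given policy."""
    if first_letter_caps and title:
        # Only the first character is forced to upper case; the rest is kept.
        return title[0].upper() + title[1:]
    return title

# With first-character capitalisation both spellings land on one page...
print(title_key("kinder", True) == title_key("Kinder", True))
# ...without it they are two separate pages.
print(title_key("kinder", False) == title_key("Kinder", False))
```

Under the first policy the German noun and the English comparative must share a page; under the second each gets its own page, and anything that used to link to the shared one needs either fixing or a redirect.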
The redirect at Kinder will be replaced at some stage, breaking the existing redirect and consequently not providing the continuity that Brion holds dear.
For the Ultimate Wiktionary I have documented some of the design criteria. It can be found here: http://meta.wikimedia.org/wiki/Ultimate_Wiktionary_decisions_on_its_usage The Data design can be found here: http://meta.wikimedia.org/wiki/Ultimate_Wiktionary_data_design
One crucial decision is that only correct spelling is allowed. This means that all incorrect spelling will be amended or deleted. As Ultimate Wiktionary is a database, it does not cater for things like redirects. I urge you to have a look at both the design criteria and the design itself because this is the time when it is relatively easy to make changes. Once Erik starts coding the UW database, having finished Wikidata and the GEMET implementation, the moment has passed us by.
Please list which of the above points are and are not considered correct spellings as far as Ultimate Wiktionary is concerned. Please then indicate whether every correct spelling is also suitable as a headword/article title/lemma or whatever you wish to call it.
Hippietrail.
Thanks, GerardM
Wikitech-l mailing list Wikitech-l@wikimedia.org http://mail.wikipedia.org/mailman/listinfo/wikitech-l