Andrew Dunbar wrote:
--- Sabine Cretella sabine_cretella@yahoo.it wrote:
Sorry, this week is full of work and I won't have too much time to dedicate before the week end.
Referring to the posts of Ray and Gerard: my problem was and still is (but much less now) how to make e.g. multilingual resources of my project on sourceforge.net available to wiktionary. At the moment we are talking about how to pass translations of words easily from one wiktionary to another, as this part is the easiest of all.
I still think even this much is a bad idea. Copy & paste in every arena always makes it quicker and easier to make mistakes. The only time you can copy all the translations from word A in language X to word A' in language Y is when the sense and subtleties are *identical*. It will not work for "mouse" or "rat" into many languages. Just thinking about it for "moose" or "elk" is painful. Just because "mouse" in English can be translated into a/b/c/d in languages W/X/Y/Z, does not mean that "mus" in "language W" can also be translated into b/c/d in languages X/Y/Z. Every single one must be checked. Copying and pasting is the opposite of checking!
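To make the point above concrete, here is a minimal sketch, not anything Wiktionary actually runs. The table uses the same placeholder languages W/X/Y/Z and words b/c/d as the paragraph above, and `Candidate` and `propose_via_pivot` are hypothetical names. It shows that translations taken over from the English entry can only ever be unverified candidates for the entry in language W:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """A translation proposed by copying, not yet checked by a person."""
    source: str
    target_lang: str
    target: str
    verified: bool = False  # copying alone never sets this to True


def propose_via_pivot(pivot_translations, source_lang):
    """Turn the translation table of one pivot word (e.g. English "mouse")
    into candidate translations for the word in source_lang."""
    source_word = pivot_translations[source_lang]
    return [
        Candidate(source_word, lang, word)
        for lang, word in sorted(pivot_translations.items())
        if lang != source_lang
    ]


# Hypothetical translation table of English "mouse" (placeholder data).
mouse = {"W": "mus", "X": "b", "Y": "c", "Z": "d"}
candidates = propose_via_pivot(mouse, "W")
for c in candidates:
    print(c.source, "->", c.target_lang, c.target, "verified:", c.verified)
```

Every candidate still has to be checked by someone who knows both languages before `verified` can be set.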
We are not talking about copy and paste; that would not be helpful. When a meaning does not have a word or phrase that is the literal translation, there is no translation in that other language. This does not mean that this meaning cannot be defined. When translations are not the same, they are not. But this is not what is at issue; at issue is opening up data to both other wiktionaries and other electronic dictionaries.
But if you had a look at the tables Polyglot uploaded to the meta site you would have seen that these tables include punctuation, synonyms, opposites, definitions, part of speech and much, much more (to me it was quite overwhelming as we translators think in glossaries and not all those particular definitions for a term).
Can you point to these tables please?
The discussion and papers are on Meta.
So now we have quite a good method to copy and paste translations quite easily into the different wiktionaries, but does it make sense that the work needs to be repeated for every language again and again?
Yes. Because every language is different. If it wasn't we'd just use Babelfish and the results would always be perfect.
When the word "applepie" gets its entry in nl:wiktionary, would it not be fair to say that the en:wiktionary content for that word is relevant? Would it not make sense to use all the content with some translations where needed?
Do we really have that much time to spend?
Whether or not we have the time to spend, a good quality dictionary requires time spent. If we choose not to spend that time we will not have a good quality dictionary. Ask any professional lexicographer. The OED, Webster's, Le Grand Robert, you name it - all the big, quality dictionaries took a long time. That is the nature of the game.
Yes, but contrary to Websters, OED, van Dale, we are not in the business of making money out of it. We are in the business of making open content and we can and should accept open content contributions from others. It is the aim of the wiktionary to have open content; and that is where we fail at this moment.
Until a few weeks ago it was not even possible to think about a copy and paste method, and thanks to Gerard it is there - so it's one huge step ahead, but it is still far away from being "time friendly". Users (contributing users) should talk about concentrating efforts in order to have better results in less time. Now we are using the wiktionary just like the huge dictionary editors used to do years ago, when every single word with all the relevant information was written on single sheets.
You will find they still do it this way. They use computers now but the work is still painstaking and precise and slow. Dictionary building is not a race.
If the en:wiktionary content were open, I would be able to re-use that content in nl:wiktionary; at this stage we cannot. Am I to believe all this work is not relevant to nl:?
The techniques are there to avoid double or, in the case of language translations, multiple work (how many wiktionaries are there? 20? 30? I did not check this out), so instead of one person needing one hour for a certain piece of work this means 30 hours of work to do the same job for 30 wiktionaries.
I can guarantee if you avoid half the work you will double the errors.
What work are you talking about: does {{en}} not translate well between the wiktionaries?
I agree, wiktionary is open content, but contributing people are working - and 30 hours are almost a week of work ... how much does this cost? Where's the break-even point between hours invested in programming and hours invested in, let's say, only 5000 terms? We are talking about computers and software of the second millennium, not about the good old Zuse.
All you are doing is labouring a non sequitur. Everybody agrees that time is both scarce and valuable. I do not agree that taking shortcuts and rushing result in better dictionaries. Anybody who has used a poor / cheap translating dictionary must know this already.
Please don't get upset now, but we should try to build things according to certain criteria from the beginning - it is much harder to do it afterwards, when a lot of data has been inserted and the need is definitely there. In my experience wiktionary is going to be used by at least 90% of people like a dictionary - to look up a word in a certain language.
E.g. what disturbs me a bit is that foreign language words are "seen" in the wiktionary. This is confusing. German should only give German terminology in its lists, English the English one, Dutch the Dutch one etc.
Woah there! Are you not aware that Wiktionary is a dictionary of all languages? It is both a definition dictionary and a translating dictionary. People whose first language is not English will be aided in their understanding of English definitions for English words when they can also see the glosses of that word in their own language.
Now I do believe that in the future we should have a way to "show only relevant information". Users should be able to choose whether they want to see the translations or not. At the moment the Wiktionary is small, the developers' time is scarce, the structure is not robust. So this will happen later.
When I am on the English wiktionary and find "Deutsch" as a single page this is not logical to me - the term "Deutsch" should be linked to the German wiktionary instead and should not appear in the listing under the letter "D".
Then the user must be able to read German to find out what the word means! This is like suggesting an English-only dictionary and a German-only dictionary would be sufficient for a monolingual to translate between the two languages!
All of what you ask for is needed, but there are already plenty of glossaries that could be uploaded and I am sure that, if we have the techniques, many companies would agree to pass their glossaries with definitions to OpenContent.
The next thing is: I now can pass e.g. the colours list to Wiktionary, but I cannot retrieve an animals list with all its translations - so it's a one-way direction - to work with many other projects both directions are needed. So many of us who now concentrate on their own projects probably would say: I pass all my data to wiktionary as I get something back. Many of these listings, definitions etc. are from people in the language industry - and people of the language industry most of all search for the right term.
So: in any case I support the XML and database way - it will take some time to develop it, but once it is there wiktionary has all the characteristics needed to become the main reference tool for people working with languages and people being interested in languages. The more people can use it the more will contribute. Gerard, Polyglot and whoever is convinced about this way: please don't stop going. You are on the right way.
One "problem" I'm beginning to see for the future is that currently Wiktionary is both the lexicographer's index cards and the finished dictionary in one place. Especially as it grows, the "finished" articles are mixed right in with the under-research articles. I don't think it's a problem yet but it will be in a couple of years and I'm not sure what kind of solution might exist.
Andrew Dunbar (hippietrail)
When the data is in a database format, it will become more obvious: you would just be able to query for words without etymology or pronunciation and tackle them. This functionality will come with better structured content, and this structure will result from the necessary work to enable import and export to the wiktionaries.
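As a rough illustration of such a query, here is a minimal sketch using Python's built-in sqlite3 module. The `entries` table, its columns and the sample rows are assumptions made up for this example; no such schema exists in Wiktionary at this point.

```python
import sqlite3


def find_incomplete_entries(conn):
    """Return (word, lang) pairs that lack an etymology or a pronunciation."""
    rows = conn.execute(
        "SELECT word, lang FROM entries "
        "WHERE etymology IS NULL OR pronunciation IS NULL "
        "ORDER BY word"
    )
    return [tuple(row) for row in rows]


# Hypothetical schema and placeholder data, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE entries "
    "(word TEXT, lang TEXT, etymology TEXT, pronunciation TEXT)"
)
conn.executemany(
    "INSERT INTO entries VALUES (?, ?, ?, ?)",
    [
        ("mouse", "en", "(etymology text)", "(pronunciation text)"),
        ("applepie", "nl", None, None),
        ("elk", "en", "(etymology text)", None),
    ],
)

incomplete = find_incomplete_entries(conn)
for word, lang in incomplete:
    print(f"{lang}:{word} still needs work")
```

The same kind of query could separate the "finished" articles from the under-research ones that Andrew mentions above.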
Now who's going to kill me ;-) ??
Ciao, Sabine
Sabine Cretella s.cretella@wordsandmore.it www.wordsandmore.it Meetingplace for translators www.wesolveitnet.com
Wiktionary-l mailing list Wiktionary-l@Wikipedia.org
Thanks, Gerard