Hello, Wikipedia's database is quite huge, but it is not growing very fast. That would change if all the Wikipedians started creating a common database. The main problem is the difference between languages, but... I have an idea! :) I know my idea will not be easy to realize, but it would be very useful.
The idea is to create a new language, based on the most popular languages from all over the world. This language would not be a human language, but a language for storing information.
Today we have some translation applications, but they are not perfect, for two reasons:
1. Some languages differ too much.
2. Some words have many meanings, and the program doesn't know which one should be chosen.
By creating a new language we would solve the first problem. (I think we do not have to create an entirely new language; maybe modifying Esperanto would be enough.) The second problem could be solved by listing all the meanings of words. For example, for the English language we could create a file like this:
word number   word   meaning
--------------------------------
1             mind   intellect
2             mind   thoughts
3             mind   a head
4             mind   to object to
The translating would look like this: I have written a sentence: "The study of logic trains the mind". The application scans my sentence and asks in which meaning I used the word "mind". Then I choose "intellect" from all the "mind" meanings. After the writer has explained all the meanings, the application saves the sentence in its own language in a structure like this: 116117 6322 987672 1 312312, where the numbers are word numbers.
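A rough Python sketch of that encoding step, just to make it concrete. Only the four "mind" rows come from the table above; the other words, their numbers, the SENSES table and the encode/pick_intellect names are invented for illustration, and words missing from the table are simply skipped.

```python
# Hypothetical sense inventory: word -> {meaning: sense number}.
# Only the "mind" entries come from the post; everything else is a placeholder.
SENSES = {
    "mind": {"intellect": 1, "thoughts": 2, "a head": 3, "to object to": 4},
    "study": {"the act of learning": 116117},
    "logic": {"the science of reasoning": 6322},
    "trains": {"exercises, develops": 987672},
}


def encode(words, choose):
    """Turn words into sense numbers, calling `choose` for ambiguous words."""
    ids = []
    for word in words:
        senses = SENSES.get(word.lower())
        if senses is None:
            continue  # words not in the inventory are skipped in this sketch
        if len(senses) == 1:
            meaning = next(iter(senses))  # only one meaning, no question needed
        else:
            meaning = choose(word, list(senses))  # ask the writer
        ids.append(senses[meaning])
    return ids


def pick_intellect(word, meanings):
    # Stand-in for the interactive question put to the writer.
    return "intellect"


sentence = "The study of logic trains the mind".split()
encoded = encode(sentence, pick_intellect)
print(encoded)
```

In a real application the `choose` callback would show the list of meanings and wait for the writer's answer; here it is hard-wired so the sketch runs without input.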
Decompression would look like this: I ask the program to display the message in Polish. The application loads the file "polish.txt" and looks up the words with these numbers. As the fourth word it loads the word from line one (because the word "mind" with the meaning "intellect" is in line 1 in all the languages, not only in English). It finds all the words and displays them.
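The lookup side could be sketched like this in Python. The POLISH_TXT string stands in for the file "polish.txt"; its "number word" line format, the sense numbers other than 1, the Polish entries and the function names are all assumptions made for the example.

```python
# Stand-in for the contents of "polish.txt": one "number word" pair per line.
# The format and the entries are assumptions; only sense 1 (mind = intellect)
# is taken from the post.
POLISH_TXT = """\
1 umysł
6322 logika
116117 nauka
987672 ćwiczy
"""


def load_lexicon(text):
    """Parse 'number word' lines into a dict: sense number -> word."""
    lexicon = {}
    for line in text.splitlines():
        number, word = line.split(maxsplit=1)
        lexicon[int(number)] = word
    return lexicon


def decode(ids, lexicon):
    """Look up each sense number; unknown numbers are marked, not dropped."""
    return [lexicon.get(i, f"<{i}?>") for i in ids]


polish = load_lexicon(POLISH_TXT)
decoded = decode([116117, 6322, 987672, 1], polish)
print(" ".join(decoded))
```

Note that this gives back a bare list of dictionary forms. It does nothing about inflection or word order, which would still need the extra grammatical information.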
I know that writing down all the meanings of words is not easy. But if every Wikipedian wrote just a few, we would finish very fast. The hardest thing is to design the language so that it describes which tense the sentence is in, what the word order should be after translating to language X and what after displaying in Y, etc. But I think this is possible and would make, e.g., building the database of Wikipedia much easier. And not only this. There will be many applications for it.
Hope you understood what I mean. I know I may have made some mistakes (both grammatical and logical)...
So- how do you like my idea? Do you think it's worth realizing?
Regards, Talthen (talthen@wp.pl)
On Tue, 10 Feb 2004 talthen@wp.pl wrote:
[...] The idea is to create new language, based on most popular languages from all over the world. This language would not be a human language, but a language to store information. [...] just enough). The second problem could be chosen by listing all the meanings of words. For example for english language we could create file like this:
word number   word   meaning
1             mind   intellect
2             mind   thoughts
3             mind   a head
4             mind   to object to
Sorry if I sound harsh, but human languages don't work this way at all. There are way too many words that you cannot translate into other languages, different structures, tenses, multiple meanings, etc.
If it were so easy, don't you think that automatic translation would work much better? What you are proposing is basically automatic translation between any two languages, with a common step in the middle. For this to work in an acceptable fashion, your database needs to cover all possible word combinations: basically, every sentence that could be uttered by anyone, complete with its previous context. And not even a computer the size of the Universe will cope with that.
Alfio
On Tue, Feb 10, 2004 at 07:29:14PM +0100, talthen@wp.pl wrote:
[...]
So- how do you like my idea? Do you think it's worth realizing?
First, choose some small area of knowledge. It doesn't matter what it is, but it must be non-trivial for the experiment to be at all meaningful. Then, try to implement something that works with this area and just a few languages.
Natural Language Processing is one of the most difficult parts of Computer Science, where a lot of really promising ideas have failed in practice. Obviously, we'd love to use anything that would make our work easier, but it would be very hard to get something like the thing you describe working.
On Feb 10, 2004, at 1:29 PM, talthen@wp.pl wrote:
The idea is to create new language, based on most popular languages from all over the world. This language would not be a human language, but a language to store information.
Ah. After my own heart. But I doubt you'll have much luck. Still, if you're willing to go through with it, develop a proof-of-concept. Even if it fails, it'll still be a good experiment.
BTW, while you're at it, work out how you'd encode "After my own heart" to make sense in other languages. :)
Peter (Spikey)
-- ---<>--- -- A house without walls cannot fall. Help build the world's largest encyclopedia at Wikipedia.org -- ---<>--- --
Sean Barrett wrote:
The idea is to create new language, based on most popular languages from all over the world. This language would not be a human language, but a language to store information.
Time flies like an arrow; fruit flies like a banana.
Hmm! A Marxist analysis of the problem! :-)
Peter Jaros wrote:
On Feb 10, 2004, at 1:29 PM, talthen@wp.pl wrote:
The idea is to create new language, based on most popular languages from all over the world. This language would not be a human language, but a language to store information.
Ah. After my own heart. But I doubt you'll have much luck. Still, if you're willing to go through with it, develop a proof-of-concept. Even if it fails, it'll still be a good experiment.
BTW, while you're at it, work out how you'd encode "After my own heart" to make sense in other languages. :)
Peter (Spikey)
Well, we already have Lojban, which more or less is an attempt to do just that. ;-)
Gregory Pietsch
Ah. After my own heart. But I doubt you'll have much luck. Still, if you're willing to go through with it, develop a proof-of-concept. Even if it fails, it'll still be a good experiment.
Concept? List all the words and their meanings, plus all the idioms with their meanings. As the next step, while translating to THE language, the writer will be asked which tense it is, what he/she wanted to say with these words, what the meanings are, etc. This should translate from imperfect human language to a perfect computer language with just a bit of help from a human. Then translating from the perfect language to any human language shouldn't be hard for a program.
BTW, while you're at it, work out how you'd encode "After my own heart" to make sense in other languages. :)
Make a list of all English idioms and their equivalents in other languages. First the program will check whether the sentence to translate contains any idioms from the list. If not, then it will translate just the words.
I started writing the translation application, but I am not a good programmer, so I don't know when I will be able to show you something that works...
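A rough sketch of that idiom-first pass, in Python. The idiom table, its identifiers and the function names are made up for illustration; a real list would need an entry for every idiom, with equivalents in each language.

```python
import re

# Hypothetical idiom table: token sequence -> idiom identifier.
IDIOMS = {
    ("after", "my", "own", "heart"): "IDIOM_AFTER_OWN_HEART",
    ("piece", "of", "cake"): "IDIOM_PIECE_OF_CAKE",
}


def tokenize(sentence):
    """Lower-case the sentence and keep only runs of letters and apostrophes."""
    return re.findall(r"[a-z']+", sentence.lower())


def encode_with_idioms(sentence):
    """Replace known idioms by their identifier; keep other words as they are."""
    tokens = tokenize(sentence)
    out = []
    i = 0
    while i < len(tokens):
        for phrase, idiom_id in IDIOMS.items():
            if tuple(tokens[i:i + len(phrase)]) == phrase:
                out.append(idiom_id)  # the whole idiom becomes one unit
                i += len(phrase)
                break
        else:
            out.append(tokens[i])  # no idiom starts here: plain word
            i += 1
    return out


result = encode_with_idioms("Ah. After my own heart.")
print(result)
```

The words that are left over would then go through the ordinary word-by-word step.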
Regards, Talthen
On Wed, Feb 11, 2004 at 01:34:30PM +0100, talthen@wp.pl wrote:
This should translate from imperfect human language to a perfect computer language with just a bit of help from human. Then translating from perfect language to any human language shouldn't be hard for program.
The idea that such a "perfect language" is possible to create is very intuitive. Yet nobody has made it so far, so it's probably much harder than it seems.
The idea that such "perfect language" is possible to create is very intuitive. Yet, nobody have made it so far, so it's probably much harder than it seems.
Sure, but have you ever heard about anybody who tried it as an Open project? Microsoft spent many years and a lot of cash on creating its Windows OSes, but Linux is so much better... I think my idea may be realized, but I can't do this just by myself...
Regards, Talthen
talthen@wp.pl wrote:
Sure, but have you ever heard about anybody who tried it as an Open project? Microsoft spend many years and cash on creating its Windows OSes, but Linux is so better... I think my idea may be realized, but I can't do this just by myself...
Well, I'm not going to say it's in principle impossible, but I think you do need to realize just how hard it is. =]
There have been various attempts in various research fields to do something of that sort. Some branches of logic work on codifying knowledge in terms of formal logical systems; clearly standard logic is too restrictive, but there are plenty of attempts to overcome that. Many areas of automated translation attempt to construct at least partial intermediate abstract representations to assist. And many areas of linguistics attempt to come up with generic explanatory and classification frameworks, at least ones that are applicable to certain subsets of languages. There's lots of stuff out there, but the main thing it has in common is that it's all pretty far from perfect. Which is not to say that it's not possible to create something that's at least useful; it's just hard.
-Mark
Hi Talthen,
Another problem you may have is whether people are able to tell your program exactly what their prose actually means. At least here in the UK, grammar is barely taught in schools anymore, so relying on British people to codify their writing correctly might not be possible. For example, I'd hazard a guess that a reasonable proportion of the population wouldn't even be able to define what a pronoun is, let alone tell you whether they were using the word 'me' as a direct or indirect object (which your program would need to know).
I'm not trying to discourage you, I'd love for this to be achieved, just mentioning something else you might want to think about.
Andrew (Ams80)
----- Original Message ----- From: talthen@wp.pl To: wikipedia-l@Wikimedia.org Sent: Wednesday, February 11, 2004 1:33 PM Subject: Re: [Wikipedia-l] An idea. What do you think about this?
The idea that such "perfect language" is possible to create is very intuitive. Yet, nobody have made it so far, so it's probably much harder than it seems.
Sure, but have you ever heard about anybody who tried it as an Open project? Microsoft spend many years and cash on creating its Windows OSes, but Linux is so better... I think my idea may be realized, but I can't do this just by myself...
Regards, Talthen
Wikipedia-l mailing list Wikipedia-l@Wikimedia.org http://mail.wikipedia.org/mailman/listinfo/wikipedia-l
"t" == talthen@wp.pl writes:
t> The idea is to create new language, based on most popular
t> languages from all over the world. This language would not be a
t> human language, but a language to store information.
[[en:pasigraphy]]
~ESP
wikipedia-l@lists.wikimedia.org