Re: [Wikipedia-l] An idea. What do you think about this?

10 Feb 2004

      On Tue, Feb 10, 2004 at 07:29:14PM +0100, talthen@wp.pl wrote:
...
Hello,
Wikipedia's database is quite large, but it is not growing very fast. That
would change if all Wikipedians started building a common database.
The main problem is the difference between languages, but... I have an idea! :) I
know my idea will not be easy to realize, but it would be very useful.
The idea is to create a new language, based on the most popular languages from all
over the world. This language would not be a human language, but a language
for storing information.
Today we have some translation applications, but they are not
perfect, for two reasons:

1. Some languages differ too much.
2. Some words have many meanings, and the program doesn't know which one
   should be chosen.

By creating a new language we would solve the first problem. (I think we do not
have to create an entirely new language; maybe modifying Esperanto would be
enough.) The second problem could be solved by listing all the meanings
of each word. For example, for English we could create a file like this:
word number    word    meaning
1              mind    intellect
2              mind    thoughts
3              mind    a head
4              mind    to object to

Translation would look like this:
I write a sentence: "The study of logic trains the mind". The application
scans my sentence and asks in which meaning I used the word "mind". I then
choose "intellect" from all the meanings of "mind". Once the writer has
explained all the meanings, the application saves the sentence in its own
language in a structure like this:
116117 6322 987672 1 312312
where the numbers are word numbers.
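
The encoding step described above can be sketched roughly as follows. This is only a minimal illustration, not a worked-out design: the names SENSES, candidates, encode and pick_intellect are hypothetical, entry 5 of the table is invented for the example, and the callback stands in for the application asking the writer which meaning was intended.

```python
# Hypothetical sense table: word number -> (English word, meaning).
# Entries 1-4 follow the table in the message; entry 5 is invented.
SENSES = {
    1: ("mind", "intellect"),
    2: ("mind", "thoughts"),
    3: ("mind", "a head"),
    4: ("mind", "to object to"),
    5: ("logic", "the study of reasoning"),
}


def candidates(word):
    """Return every word number listed for this word, in ascending order."""
    return sorted(n for n, (w, _) in SENSES.items() if w == word)


def encode(words, choose):
    """Turn words into word numbers; `choose` resolves ambiguous words."""
    numbers = []
    for word in words:
        options = candidates(word)
        if not options:
            raise KeyError(f"no meaning listed for {word!r}")
        if len(options) == 1:
            numbers.append(options[0])
        else:
            # More than one meaning: ask the writer (here, a callback).
            numbers.append(choose(word, options))
    return numbers


def pick_intellect(word, options):
    """Stand-in for the writer answering the application's question."""
    return next(n for n in options if SENSES[n][1] == "intellect")


print(encode(["logic", "mind"], pick_intellect))  # [5, 1]
```

A real application would put an interactive prompt where pick_intellect is, and would need a table covering every word and meaning.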
Decoding would look like this:
I ask the program to display the message in Polish. The application
loads the file "polish.txt" and looks up the words with these numbers.
For the fourth number it loads the word from line 1 (because the word "mind" with
the meaning "intellect" is on line 1 in all the languages, not only in English).
It finds all the words and displays them.
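
A rough sketch of that lookup, assuming each language file holds one word per line and that line N is the word for word number N. The file contents below are made up for illustration (only the four "mind" meanings), and load_lexicon and decode are hypothetical names; strings are used in place of real files so the example is self-contained.

```python
# Hypothetical contents of per-language files: line N holds the word
# for word number N. Only the four "mind" meanings are listed here.
ENGLISH_TXT = "mind\nmind\nmind\nmind\n"
POLISH_TXT = "umysł\nmyśli\ngłowa\nmieć coś przeciwko\n"


def load_lexicon(text):
    """Map 1-based line numbers to the word stored on that line."""
    return {n: line for n, line in enumerate(text.splitlines(), start=1)}


def decode(numbers, lexicon):
    """Look up each word number and join the words in their stored order."""
    return " ".join(lexicon[n] for n in numbers)


polish = load_lexicon(POLISH_TXT)
english = load_lexicon(ENGLISH_TXT)

print(decode([1], polish))   # umysł
print(decode([1], english))  # mind
```

Note that this lookup keeps the words in the order they were stored and does nothing about grammar, so word order and tense in the target language remain unsolved by it.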
I know that writing down all the meanings of words is not easy. But if every
Wikipedian writes just a few, we would finish very fast.
The hardest part is designing the language so that it describes which tense
the sentence is in, what the word order should be after translating to
language X and what it should be when displaying in Y, etc.
But I think this is possible and would make, e.g., building the Wikipedia
database much easier.
And not only that. There would be many applications for it.
I hope you understood what I mean. I know I may have made some mistakes (both
grammatical and logical)...
So, how do you like my idea? Do you think it's worth realizing?
First, choose some small area of knowledge. It doesn't matter what it is,
but it must be non-trivial for the experiment to be meaningful.
Then, try to implement something that works with this area and just a few languages.
Natural Language Processing is one of the most difficult parts of Computer
Science, where a lot of really promising ideas have failed in practice.
Obviously, we'd love to use anything that would make our work easier,
but it would be very hard to get something like what you describe working.
