Could someone take over this question? I don't know what to tell these
guys.
Thanks.
Ed Poor
-----Original Message-----
From: Don Rameez [mailto:rameezdon@hotmail.com]
Sent: Saturday, October 18, 2003 3:00 PM
To: Poor, Edmund W
Subject: RE: "Auto Extraction from WWW"
Dear Edmund,
Thanks for your reply.
I have installed MySQL on Windows and tried to open the SQL dump, but somehow I was unable to do so.
Could you please guide me on this? (I am a novice as far as MySQL is concerned.)
Hope to hear from you soon.
Regards,
Don Rameez
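For what it's worth, the dump is an ordinary text file produced by mysqldump, so a quick sanity check before blaming the MySQL install is to scan it for SQL statements. A rough Python sketch, with no assumptions beyond the standard mysqldump output format:

```python
import re

def summarize_dump(path, encoding="utf-8"):
    """Scan a mysqldump file and report which tables it defines and fills.

    Returns {table_name: set of statement kinds seen}, which is enough to
    confirm the file really is a SQL dump before loading it into MySQL.
    """
    tables = {}
    # mysqldump starts these statements at column 0; table names may be
    # wrapped in backticks.
    pattern = re.compile(r"^(CREATE TABLE|INSERT INTO)\s+`?(\w+)`?")
    with open(path, encoding=encoding, errors="replace") as dump:
        for line in dump:
            match = pattern.match(line)
            if match:
                kind, name = match.groups()
                tables.setdefault(name, set()).add(kind)
    return tables
```

If this reports a cur table, the file is intact and the problem is on the MySQL side — typically that the dump is being opened as a document rather than fed to the mysql command-line client as input.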
-------Original Message-------
From: Poor, Edmund W <mailto:Edmund.W.Poor@abc.com>
Date: Wednesday, October 15, 2003 01:06:19 AM
To: Don Rameez <mailto:rameezdon@hotmail.com>
Subject: RE: "Auto Extraction from WWW"
I don't know how to convert SQL tables into plain TEXT. Why not use a
SELECT statement? Like:
SELECT cur_text FROM cur
Ed Poor
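If getting MySQL to accept the dump is itself the obstacle, the dump can also be mined directly: it is plain SQL text, so a short Python script can pull a column such as cur_text out of the INSERT statements without a running server. A rough sketch, assuming mysqldump's default single-quote/backslash string escaping and that cur_text is a column of the cur table:

```python
def parse_insert_values(statement):
    """Split "INSERT INTO ... VALUES (...),(...);" into row tuples.

    Minimal scanner: understands single-quoted strings with backslash
    escapes (mysqldump's default); numbers and NULL come back as raw text.
    """
    s = statement[statement.index("VALUES") + len("VALUES"):].strip().rstrip(";")
    rows, row, field = [], [], []
    in_row = in_str = False
    i = 0
    while i < len(s):
        c = s[i]
        if in_str:
            if c == "\\" and i + 1 < len(s):   # escaped character, e.g. \'
                field.append(s[i + 1])
                i += 2
                continue
            elif c == "'":
                in_str = False                  # closing quote
            else:
                field.append(c)
        elif c == "'":
            in_str = True                       # opening quote
        elif c == "(":
            in_row, row, field = True, [], []   # start of a row tuple
        elif c == ")":
            row.append("".join(field))          # end of row: flush last field
            rows.append(tuple(row))
            in_row = False
        elif c == ",":
            if in_row:                          # field separator inside a row
                row.append("".join(field))
                field = []
        elif in_row and not c.isspace():
            field.append(c)                     # unquoted literal (number, NULL)
        i += 1
    return rows

def extract_column(dump_path, table, column, out_path):
    """Stream a dump file and write one column of one table as plain text."""
    with open(dump_path, encoding="utf-8", errors="replace") as dump, \
         open(out_path, "w", encoding="utf-8") as out:
        for line in dump:
            if line.startswith("INSERT INTO") and table in line.split("VALUES")[0]:
                for r in parse_insert_values(line):
                    out.write(r[column] + "\n")
```

The column index and table name here are illustrative; the actual position of cur_text has to be read off the CREATE TABLE statement in the dump itself.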
-----Original Message-----
From: Don Rameez [mailto:rameezdon@hotmail.com]
Sent: Sunday, October 12, 2003 9:33 PM
To: Poor, Edmund W
Subject: RE: "Auto Extraction from WWW"
Dear Edmund,
Thanks for acknowledging my mail so quickly and answering my queries.
I appreciate your concern for the knowledge base.
I would like to ask you one more question:
Q) We now have the SQL dump. Apart from MySQL, is there any way we can access the data in some other format (say, plain text)?
Regards,
Don Rameez
-------Original Message-------
From: Poor, Edmund W <mailto:Edmund.W.Poor@abc.com>
Date: Tuesday, October 14, 2003 11:27:37 AM
To: Don Rameez <mailto:rameezdon@hotmail.com>
Subject: RE: "Auto Extraction from WWW"
Your questions are best put to our senior developers, but I'll give you
some preliminary answers.
1. All our articles are stored as plain English text. There is a bit of
markup used for links.
2. We are not encouraging direct server-to-server links. Rather, we
invite users to edit articles via the web interface.
3. You can get a SQL dump, if you want the entire database. It's much
less than one GB in size, and could possibly fit on one CD (we are
planning to publish a CD eventually).
The difference between our project and yours is that we are a
non-encoded encyclopedia. We just have a collection of articles.
You are trying to "encode" knowledge, which is Very Difficult. Many
attempts have been made in the past; I can't think of a single success,
but I can think of half a dozen spectacular failures. It's harder than
it looks!
I applaud the attempt, but this task involves artificial intelligence
(AI), and AI has not progressed beyond the so-called "expert system" or
"neural net". These are the toys of AI and have not produced reliable,
comprehensive results.
What do you really hope to accomplish, in the next 5 to 10 years?
Sincerely,
Ed Poor
Developer & Sysop
Wikipedia
-----Original Message-----
From: Don Rameez [mailto:rameezdon@hotmail.com]
Sent: Saturday, October 11, 2003 11:35 PM
To: JeLuF(a)gmx.de; ts4294967296(a)hotmail.com; maveric149(a)yahoo.com;
Poor, Edmund W; wikitech-l(a)Wikipedia.org
Cc: nagarjun(a)hbcse.tifr.res.in
Subject: "Auto Extraction from WWW"
Dear Sir,
We are a group of three students currently pursuing our B.E. in Information
Technology (Bachelor of Engineering) at Mumbai University, India.
We are working on a project titled "Auto Extraction of Contents from the
World Wide Web" as part of our B.E. coursework, at the renowned Homi Bhabha
Centre for Science Education - Tata Institute of Fundamental Research
(HBCSE-TIFR), under the guidance of Dr. Nagarjuna G.
Our project is based on :
OS - GNU/LINUX
Language - Python
Server - Zope
Application - GNOWSYS
GNOWSYS (Gnowledge Networking and Organizing System) is a web application
for developing and maintaining semantic web content. It is developed in
Python and runs as an installed product in Zope. Our project involves
automatically extracting data from the World Wide Web (WWW) and using
GNOWSYS to handle this vast amount of data. This will not only help us
store data in the Gnowledge base in the form of meaningful relationships,
but also exercise its handling of huge amounts of data.
The URL for our site is
http://www.gnowledge.org
In this regard we could think of no one but Wikipedia, which is in itself
a phenomenon.
We would be glad if you could answer a few of our queries:
1] In what format is the data stored in Wikipedia?
2] Apart from HTTP and FTP, are there any other protocols in use that we
would need in order to communicate with the Wikipedia server?
3] How can we utilize the SQL dump?
We hope you will answer our queries at your earliest convenience.
With warm regards,
Thank you,
[ Rameez Don , Jaymin Darbari, Ulhas Dhuri ]