Hi Jose,
Glad you're finding a use for Wikipedia. A quick heads up:
the text is not "copyright-free" in the sense of public domain, but it does
have a "free" licence, so you should be able to do what you want at no
cost and without extra permissions as long as you attribute us when
publishing. See http://www.wikipedia.org/wiki/Wikipedia:Copyrights for
the full details.
Pete
-----Original Message-----
From: wikitech-l-bounces(a)Wikipedia.org
[mailto:wikitech-l-bounces@Wikipedia.org] On Behalf Of Jose Quesada
Sent: 27 November 2003 22:36
To: wikitech-l(a)wikipedia.org
Subject: [Wikitech-l] Wikipedia full dump (English) broken link?
Hi,
Here at CU we work with corpora of text to train models that 'understand'
language (see, e.g., LSA.colorado.edu). We wanted to use Wikipedia to
create a copyright-free corpus of text that anyone in the scientific
community could use. To do that we downloaded the DB dumps a while ago
(about 2 billion words), but due to a computer problem, we lost them.
I have noticed that the link to the full English database (2280MB):
http://download.wikipedia.org/archives/en/20031125_old_table.sql.bz2
doesn't work anymore; it returns a Forbidden error saying that you don't
have permission to access
/archives/en/20031125_old_table.sql.bz2 on this server.
Could you please grant us access to the file?
Thanks a lot in advance,
-Jose
--
Jose Quesada, PhD.
quesadaj(a)psych.colorado.edu Research associate
http://lsa.colorado.edu/~quesadaj Institute of Cognitive Science
University of Colorado (Boulder)
Muenzinger psychology building Phone: 303 492 1522
office D447A Fax: 303 492 7177
Campus Box 344
University of Colorado at Boulder
Boulder, CO 80309-0344
_______________________________________________
Wikitech-l mailing list
Wikitech-l(a)Wikipedia.org
http://mail.wikipedia.org/mailman/listinfo/wikitech-l