New subject: [Ticket#2010112810016598] I would like to provide a different search engine for Wikimedia

1 Dec 2010


Dear Wikiteam,
  Guy Chapman asked me to post to this mailing list about the request described in the next paragraph: how can we obtain a copy of Wikipedia so that we can offer it as a database in our free search service?  He made me aware of its size, but that is not an issue.  I would like to obtain a copy and then establish a routine for automated, synced downloads, as we do for the other databases in our system.
  I have had several requests to add Wikipedia to our eTBLAST text similarity search engine.  This is to improve reference finding as well as novelty assessment.  Our search tool is widely used, widely published, and free.  Please see etblast.org or http://en.wikipedia.org/wiki/ETBLAST.  I would like to create a searchable local copy of Wikipedia with links back to Wikipedia for hits, and of course acknowledge Wikimedia.  We do this for several open text datasets and are prepared to keep a local, synced copy of Wikipedia, if you are interested.  I am certain that our mutual users would welcome and benefit from our working together.
Cheers, and thank you,
Skip
----- Original Message -----
From: "Wikipedia information team" info-en@wikimedia.org
To: "Skip Garner" garner@vbi.vt.edu
Cc: "Dominik L. Borkowski" dom@vbi.vt.edu, "Johnny Sun" szhaohui@vbi.vt.edu
Sent: Wednesday, December 1, 2010 9:43:25 AM
Subject: Re: [Ticket#2010112810016598] I would like to provide a different search engine for Wikimedia
Dear Skip Garner,
Thank you for your email.  Our response follows your message.
11/29/2010 16:23 - Skip Garner wrote:
...
Guy,
  Thank you for the information.  I would like to move forward on this, for I
think it will be of mutual value.  The size of the database is not an issue, and
we are always expanding our storage and serving capabilities.  We regularly work
with data sets hundreds of terabytes in size.  One issue would be getting the
first copy, but we could probably handle that by FedEx.
...
Can you tell me how we can proceed?
Cheers,
Skip
The best bet is probably to email the wikitech mailing list, which is where the
devs hang out.
https://lists.wikimedia.org/mailman/listinfo/wikitech-l
They will have the best idea of the practicalities.
Yours sincerely,
Guy Chapman
-- 
Wikipedia - http://en.wikipedia.org
---
Disclaimer: all mail to this address is answered by volunteers, and responses are
not to be considered an official statement of the Wikimedia Foundation. For
official correspondence, please contact the Wikimedia Foundation by certified mail
at the address listed on http://www.wikimediafoundation.org


-- 
Harold "Skip" Garner
Executive Director
Virginia Bioinformatics Institute
Virginia Tech
Washington Street (0477)
Blacksburg, VA 24061
http://www.vbi.vt.edu

Phone: 540.231.2582
Fax: 540.231.1388

Assistant: Renee Nester
renee@vbi.vt.edu
540.231.2582
