Brion -- have you considered using Solr, which is built on Lucene? It is an enterprise-class search engine; v1.3 is nearing release and, in addition to XML and plain text, it supports search inside rich documents including MS Office and PDF.
http://lucene.apache.org/solr/
Dan
On Feb 1, 2008 2:20 PM, Brion Vibber <brion@wikimedia.org> wrote:
rainman@svn.wikimedia.org wrote:
Revision: 30390
Author: rainman
Date: 2008-02-01 13:17:38 +0000 (Fri, 01 Feb 2008)
Log Message:
A new branch of the LuceneSearch extension for the new daemon: will add AJAX search and make some minor interface improvements.
Just a note -- I would strongly recommend against continuing development on the old LuceneSearch front-end extension, as it's a maintenance nightmare.
Instead, new front-end code should go in the Special:Search front-end in core, with a back-end plugin to talk to the Lucene server (the MWSearch extension, which may be a bit out of date).
-- brion vibber (brion @ wikimedia.org)
Wikitech-l mailing list Wikitech-l@lists.wikimedia.org http://lists.wikimedia.org/mailman/listinfo/wikitech-l