Re: [Wikitech-l] [MediaWiki-CVS] SVN: [30390] branches/LuceneSearch-ajax/

2 Feb 2008


On Feb 1, 2008 9:16 PM, Dan Thomas geoobject@gmail.com wrote:
...
Brion -- have you considered using SOLR, which extends Lucene? An
enterprise-class search engine, v1.3 is nearing release and in
addition to XML and text, supports search inside rich documents
including MS Office and PDF.
http://lucene.apache.org/solr/

Solr is a great wrapper around Lucene, however I believe its focus is
different from what we need - my impression is that its main goal is to
provide an easy and powerful interface for what Lucene already does, with
enhancements relevant for enterprise applications (e.g. flexible schema
structure). It addresses almost none of the issues we are having:
1) Solr doesn't support distributed searching and split indexes (this is
however being worked on, afaik) - this is crucial since our indexes are just
too big to fit on a single host
2) there is no advanced scoring scheme, for instance one using backlinks,
i.e. it offers the same as Lucene
3) the default spellchecker is the one Lucene uses, i.e. with per-word
suggestions - works fine on small data sets but gives pretty bad suggestions
on large ones
4) it uses highlighting that splits text into equal-size chunks without
trying to look at sentence boundaries, and it also doesn't support
highlighting matching phrases. I'm also not sure how efficient its text
storage is, and we have a huge amount of text
5) no integrated prefix search for ajax suggestions
6) no parser for wiki syntax (although there is a wiki-parser being
developed for Lucene)
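
To make point 5 a bit more concrete, here is a minimal sketch of what a prefix search for ajax suggestions boils down to: given a sorted list of page titles, return the first few that start with what the user has typed. This is only an illustration and not how the Lucene search back-end or Solr actually does it; the function name `prefix_suggest`, the `limit` parameter and the sample titles are all made up for the example.

```python
import bisect


def prefix_suggest(sorted_titles, prefix, limit=10):
    """Return up to `limit` titles that start with `prefix` (case-sensitive).

    Hypothetical helper for illustration only; `sorted_titles` must
    already be sorted in ascending order.
    """
    # Binary search for the first title >= prefix. Every title sharing
    # the prefix sits in one contiguous run starting at that position.
    start = bisect.bisect_left(sorted_titles, prefix)
    results = []
    for title in sorted_titles[start:start + limit]:
        if not title.startswith(prefix):
            break  # left the run of matching titles
        results.append(title)
    return results


# Made-up sample data, not real index contents.
titles = sorted(["Lucene", "Luxembourg", "Lua", "MediaWiki", "Solr"])
print(prefix_suggest(titles, "Lu"))
```

The lookup is a binary search plus a short scan, so it stays cheap per keystroke. A real implementation would still have to deal with case folding, ranking of the matches, and a title list that is too large to keep in memory on one host.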
...
On Feb 1, 2008 2:20 PM, Brion Vibber brion@wikimedia.org wrote:
...
Instead, new front-end code should be in the Special:Search front-end in
core, with a back-end plugin to talk to the Lucene server (the MWSearch
extension, possibly a bit out of date.)
<nod>
r.
