[Mediawiki-l] Greek letters: char vs glyph?

Sylvain Machefert iubito at gmail.com
Thu Aug 9 03:17:56 UTC 2007


Hello
Here is my code with comments. I'm currently running MediaWiki 1.10.0, but this
has worked for a long time.

In SpecialSearch.php, in the showResults($term) method, just before the serious
work begins with
$search = SearchEngine::create(); (~line 166)
there is a condition which runs the SearchOverride hook:
if (wfRunHooks('SearchOverride', array($this, $term))) {

I put the closing } just before the end of the method:
} //endif run hook SearchOverride
$wgOut->addHTML( $this->powerSearchBox( $term ) ); (~line 220)
wfProfileOut( $fname );
} //end of showResults method
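
To make the control flow of that "if" easier to follow, here is a minimal, self-contained sketch of the override pattern. It does not use MediaWiki: runHooksSketch(), overrideSketch() and showResultsSketch() are hypothetical stand-ins, and the real handler receives ($this, $term) rather than just $term.

```php
<?php
// Simplified, hypothetical stand-in for wfRunHooks(): calls each handler in
// turn and reports false as soon as one of them returns false.
function runHooksSketch( $handlers, $args ) {
    foreach ( $handlers as $handler ) {
        if ( call_user_func_array( $handler, $args ) === false ) {
            return false;
        }
    }
    return true;
}

// Hypothetical handler: it would do its own search and display here,
// then return false to say "I have handled everything".
function overrideSketch( $term ) {
    return false;
}

// Mirrors the shape of the patched showResults(): the regular search sits
// inside the "if", so it only runs when no handler returned false.
function showResultsSketch( $term, $handlers ) {
    if ( runHooksSketch( $handlers, array( $term ) ) ) {
        return "regular search for $term";
    }
    return "regular search skipped";
}

echo showResultsSketch( 'alpha', array() ), "\n";                   // regular search for alpha
echo showResultsSketch( 'alpha', array( 'overrideSketch' ) ), "\n"; // regular search skipped
```

With no handler registered the wiki behaves as before, which is why the patch to SpecialSearch.php is harmless when the extension is not included.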

In the extensions folder, create a file extendedsearch.php and include it in
LocalSettings like other extensions.
SpecialSearch does 2 searches, in titles and in text, and displays the results.
What I add: a 3rd search in the text for similar words that does not require all
of them to match (strict matching = false), and I display those results too.
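
The include line and the call to the setup function are not shown in this post. As an assumption about how it is wired up, the usual convention for extensions of that era would look roughly like this fragment (the path assumes the file sits directly in the extensions folder):

```php
<?php
// In LocalSettings.php (assumed location of the extension file):
require_once( "$IP/extensions/extendedsearch.php" );

// At the top of extensions/extendedsearch.php, so that MediaWiki calls
// wfExtendedSearchSetup() once its own initialisation is done:
$wgExtensionFunctions[] = 'wfExtendedSearchSetup';
```

Registering the setup function through $wgExtensionFunctions, rather than calling it directly, delays it until globals such as $wgMessageCache are available.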

Here is the code:

function wfExtendedSearchSetup() {
   global $wgHooks, $wgMessageCache;
   $wgHooks['SearchOverride'][] = 'extendedSearch';
   $wgMessageCache->addMessage('othermatches', 'Mots ressemblants / Similar words');
}

function extendedSearch($specialSearchObj, $term) {
   global $wgOut, $wgRequest;

   //Copied from includes/SpecialSearch.php

   //Number of results
   $num = 0;

   $search = SearchEngine::create();
   $search->setLimitOffset( $specialSearchObj->limit,
      $specialSearchObj->offset );
   $search->setNamespaces( $specialSearchObj->namespaces );
   $search->showRedirects = $specialSearchObj->searchRedirects;
   $search->strictMatching = true;
   //This searches titles and text for the exact term, like the regular
   //SpecialSearch.
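   //A search may return no result set at all, so guard before counting rows
   $titleMatches = $search->searchTitle( $term );
   $num += $titleMatches ? $titleMatches->numRows() : 0;
   $textMatches = $search->searchText( $term );
   $num += $textMatches ? $textMatches->numRows() : 0;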

   //Now search for a list of words, no strict matching
   //$othersTerm is an array
   $search->strictMatching = false;
   //getOtherSearches is YOUR method which returns the list of words
   //e.g. $term = "alpha", returns array( "alpha", "α" )
   $othersTerm = getOtherSearches($term);
   $othersMatches = $search->searchText(implode(' ', $othersTerm));
   $num += $othersMatches->numRows();

   //Now display results in the same way as the regular SpecialSearch
   if ( $num >= $specialSearchObj->limit ) {
      $top = wfShowingResults( $specialSearchObj->offset,
         $specialSearchObj->limit );
   } else {
      $top = wfShowingResultsNum( $specialSearchObj->offset,
         $specialSearchObj->limit, $num );
   }
   $wgOut->addHTML( "<p>{$top}</p>\n" );

   if( $num || $specialSearchObj->offset ) {
      $prevnext = wfViewPrevNext( $specialSearchObj->offset,
         $specialSearchObj->limit,
         'Special:Search',
         wfArrayToCGI(
            $specialSearchObj->powerSearchOptions(),
            array( 'search' => $term ) ) );
      $wgOut->addHTML( "<br />{$prevnext}\n" );
   }

   //I don't remember what convertForSearchResult does, but I apply it to all
   //similar words + the searched $term
   global $wgContLang;
   //$tm = $wgContLang->convertForSearchResult( $search->termMatches() );
   $othersTerm[] = $term;
   $tm = $wgContLang->convertForSearchResult( $othersTerm );
   $terms = implode( '|', $tm );

   //Display matches in title (from SpecialSearch)
   if( $titleMatches ) {
      if( $titleMatches->numRows() ) {
         $wgOut->addWikiText( '==' . wfMsg( 'titlematches' ) . "==\n" );
         $wgOut->addHTML( $specialSearchObj->showMatches( $titleMatches,
            $terms ) );
      } else {
         $wgOut->addWikiText( '==' . wfMsg( 'notitlematches' ) . "==\n" );
      }
   }

   //Display matches in text (from SpecialSearch)
   if( $textMatches ) {
      if( $textMatches->numRows() ) {
         $wgOut->addWikiText( '==' . wfMsg( 'textmatches' ) . "==\n" );
         $wgOut->addHTML( $specialSearchObj->showMatches( $textMatches,
            $terms ) );
      } elseif( $num == 0 ) {
         # Don't show the 'no text matches' if we received title matches
         $wgOut->addWikiText( '==' . wfMsg( 'notextmatches' ) . "==\n" );
      }
   }

   //Display matches for similar words
   if ($othersMatches->numRows() || true/*for debugging*/) {
      $wgOut->addWikiText('=='.wfMsg('othermatches')."==\n" );
      $wgOut->addHTML('<!-- Debug ('.count($othersTerm).') : '
         . implode(', ', $othersTerm).' -->');
      $wgOut->addHTML($specialSearchObj->showMatches($othersMatches,
         $terms));
   }

   //conclusion
   if ( $num == 0 ) {
      $wgOut->addWikiText( wfMsg( 'nonefound' ) );
   }
   if( $num || $specialSearchObj->offset ) {
      $wgOut->addHTML( "<p>{$prevnext}</p>\n" );
   }

   //Returning false tells wfRunHooks not to execute what is inside the
   //"if (wfRunHooks(...))" block, so the regular search in SpecialSearch is
   //not executed: I copied it here, so it has already been done.
   //It's an "override" hook.
   return false;
}
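
The code above leaves getOtherSearches() to you. As a starting point for the Greek letter case in this thread, here is a minimal sketch; the lookup table and the matching rules are my assumptions for illustration, not the code running on my site.

```php
<?php
// Hypothetical example of getOtherSearches(): returns the original term plus
// variants where a spelled-out Greek letter name is swapped for its glyph,
// or the glyph for its name. Extend the table to suit your wiki.
function getOtherSearches( $term ) {
    $equivalents = array(
        'alpha' => 'α',
        'beta'  => 'β',
        'gamma' => 'γ',
    );
    $words = array( $term );
    foreach ( $equivalents as $name => $glyph ) {
        // Letter name found (case-insensitive): add the glyph variant
        if ( stripos( $term, $name ) !== false ) {
            $words[] = str_ireplace( $name, $glyph, $term );
        }
        // Glyph found: add the spelled-out variant
        if ( strpos( $term, $glyph ) !== false ) {
            $words[] = str_replace( $glyph, $name, $term );
        }
    }
    // Drop duplicates and reindex so implode() gets a clean list
    return array_values( array_unique( $words ) );
}

print_r( getOtherSearches( 'alpha' ) );   // the term itself, then "α"
print_r( getOtherSearches( 'α-helix' ) ); // the term itself, then "alpha-helix"
```

A term with no entry in the table comes back alone, so the third search then simply repeats the text search without strict matching.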

I hope the formatting will not be destroyed in email :-)

Hope this helps,
Sylvain

2007/8/7, Jim Hu <jimhu at tamu.edu>:
>
> On Aug 7, 2007, at 1:48 AM, Sylvain Machefert wrote:
> ><snip>
> > I've modified the search process on my site.
> > "Jovano Jovanke" is a well known song from Macedonia, pronounced
> > iovano... (the "j" is pronounced "i"). So everybody writes it a
> > different way: ïovano, iovano, jovano... If you search for "iovano"
> > you'll find "jovano", which is the right orthography.
> > http://tousauxbalkans.jexiste.fr/index.php5?search=iovano&go=Consulter
> >
> > The same goes for a lot of letters with diacritics outside Latin-1, and
> > also Cyrillic... "iovano" also searches for "Йовано"... It's not yet
> > perfect, not finished; it was hard work.
>
> What is your approach to tweaking the search?  Do you modify the
> searchindex table itself, or do something to the query, or both?
> Since the searchindex has a minimum word length for searching,
> looking for single characters seems like it won't work.  I was
> thinking that the way to go would be to transliterate (is that the
> right word) these when building the searchindex.  I think my needs
> are less demanding than yours!
>
> Jim
>
> <snip>
>

-- 
Sylvain Machefert
http://iubito.free.fr
http://tousauxbalkans.jexiste.fr

