I'm playing with CharInsert and edittools on EcoliWiki, and noticed that Greek letters are inserted as characters rather than as HTML entities, e.g.
α vs. &alpha;
I was thinking of switching to the entities because it seems like searching for alpha-something (common in chemistry and biochemistry) is easier than searching for α-something. Are there important reasons I should not do this?
Jim
=====================================
Jim Hu
Associate Professor
Dept. of Biochemistry and Biophysics
2128 TAMU
Texas A&M Univ.
College Station, TX 77843-2128
979-862-4054
Sigh.... my attempt to get CharInsert to insert the HTML entity &alpha; etc. instead of the character was a miserable failure. Putting &alpha; in charinsert still inserts a lovely Greek letter, but it doesn't show up as &alpha; in the wikitext. I'd like it to be saved in the database as &alpha; so that searching for "alpha" finds it.
I'm guessing that I have to tweak something. Anyone else run into this?
Jim
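A minimal sketch of why the entity form matters for searching, assuming only PHP's standard html_entity_decode(). This is not CharInsert's actual code; the sample string is made up for illustration. Once the entity is decoded to the character, the stored text no longer contains the letters "alpha" at all:

```php
<?php
// Illustration only: a made-up sample string, not taken from any wiki page.
$withEntity = '&alpha;-ketoglutarate';

// Decoding turns the entity into the single Greek character (UTF-8 encoded).
$withCharacter = html_entity_decode($withEntity, ENT_QUOTES, 'UTF-8');

// The entity form contains the substring "alpha"; the decoded form does not.
$entityHasAlpha = strpos($withEntity, 'alpha') !== false;
$characterHasAlpha = strpos($withCharacter, 'alpha') !== false;
```

So text saved with the character can only be found by searching for the character itself, unless the search or the index is adjusted.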
On Aug 6, 2007, at 3:02 PM, Jim Hu wrote:
<snip>
MediaWiki-l mailing list MediaWiki-l@lists.wikimedia.org http://lists.wikimedia.org/mailman/listinfo/mediawiki-l
I think you'd better tweak the search. If you enter &alpha, a search for "alpha" will find it, but the article will display the literal text "& alpha ;" when you view it.
I've modified the search process on my site. "Jovano Jovanke" is a well-known song from Macedonia, pronounced "iovano" (the "j" is pronounced like "i"). So everybody writes it a different way: ïovano, iovano, jovano... If you search for "iovano" you'll find "jovano", which is the correct spelling. http://tousauxbalkans.jexiste.fr/index.php5?search=iovano&go=Consulter
The same goes for many letters with diacritics outside Latin-1, and also for Cyrillic: a search for "iovano" also looks for "Йовано". It's not perfect yet and not finished; it was hard work.
2007/8/7, Jim Hu jimhu@tamu.edu:
<snip>
On Aug 7, 2007, at 1:48 AM, Sylvain Machefert wrote:
I think you'd better tweak the search. If you enter &alpha, a search for "alpha" will find it, but the article will display the literal text "& alpha ;" when you view it.
I was reluctantly coming to the same conclusion.
I've modified the search process on my site. "Jovano Jovanke" is a well-known song from Macedonia, pronounced "iovano" (the "j" is pronounced like "i"). So everybody writes it a different way: ïovano, iovano, jovano... If you search for "iovano" you'll find "jovano", which is the correct spelling. http://tousauxbalkans.jexiste.fr/index.php5?search=iovano&go=Consulter
The same goes for many letters with diacritics outside Latin-1, and also for Cyrillic: a search for "iovano" also looks for "Йовано". It's not perfect yet and not finished; it was hard work.
What is your approach to tweaking the search? Do you modify the searchindex table itself, or do something to the query, or both? Since the searchindex has a minimum word length for searching, looking for single characters seems like it won't work. I was thinking that the way to go would be to transliterate (is that the right word?) these characters when building the searchindex. I think my needs are less demanding than yours!
Jim
<snip>
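The transliteration idea in the message above could be sketched roughly as follows. This is only an illustration under stated assumptions: the function name transliterateForIndex and the four-letter map are hypothetical, and where such a function would be called while MediaWiki builds the searchindex is not shown here. The point is that a single character such as α, which is too short to be indexed as a word on its own, becomes an ordinary searchable word.

```php
<?php
// Hypothetical helper: replace Greek letters with their spelled-out names
// before the text is handed to the search index.
function transliterateForIndex($text) {
    // Small illustrative map; a real one would cover the whole alphabet.
    static $map = array(
        'α' => 'alpha',
        'β' => 'beta',
        'γ' => 'gamma',
        'δ' => 'delta',
    );
    // strtr() with an array replaces each key with its value in one pass.
    return strtr($text, $map);
}

$indexed = transliterateForIndex('α-ketoglutarate and β-galactosidase');
```

With this approach the page text itself keeps the Greek characters; only the copy used for indexing is changed.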
Hello,

Here is my code with comments. I'm now running MediaWiki 1.10.0, but this has worked for a long time.
In SpecialSearch.php, in the method showResults($term), just before the serious work starts at $search = SearchEngine::create(); (~line 166), I have a condition that runs the hook SearchOverride:

if (wfRunHooks('SearchOverride', array($this, $term))) {

I put the closing } just before the end of the method:

    } //endif run hook SearchOverride
    $wgOut->addHTML( $this->powerSearchBox( $term ) ); // ~line 220
    wfProfileOut( $fname );
} //end of showResults method
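The control flow of that patch can be sketched in isolation. The code below is a simplified stand-in, not MediaWiki's wfRunHooks; the names runHooksSketch, overrideSketch and showResultsSketch are hypothetical. It assumes the convention described in this thread: when a handler returns false, the caller skips its default code.

```php
<?php
// Simplified stand-in for a hook runner; this is NOT MediaWiki's wfRunHooks.
// $registry maps an event name to a list of handler function names.
function runHooksSketch(array $registry, $event, array $args) {
    if (!isset($registry[$event])) {
        return true; // no handlers: the caller proceeds with its default code
    }
    foreach ($registry[$event] as $handler) {
        if (call_user_func_array($handler, $args) === false) {
            return false; // a handler took over: the caller skips its default code
        }
    }
    return true;
}

// Hypothetical handler that replaces the default search entirely.
function overrideSketch($term) {
    return false;
}

// Mirrors the shape of the patched showResults(): default code sits inside the if.
function showResultsSketch(array $registry, $term) {
    if (runHooksSketch($registry, 'SearchOverride', array($term))) {
        return 'default search';
    }
    return 'handled by hook';
}
```

With an empty registry the default branch runs; once a handler that returns false is registered for 'SearchOverride', the default branch is skipped.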
In the extensions folder, create extendedsearch.php and include it in LocalSettings like other extensions. SpecialSearch does 2 searches, in titles and in text, and displays the results. What I add: a 3rd search in the text for similar words, without requiring all of them to match (strict matching = false), and I display those results too.
Here is the code:
function wfExtendedSearchSetup() {
    global $wgHooks, $wgMessageCache;
    $wgHooks['SearchOverride'][] = 'extendedSearch';
    $wgMessageCache->addMessage('othermatches', 'Mots ressemblants / Similar words');
}
function extendedSearch($specialSearchObj, $term) {
    global $wgOut, $wgRequest;

    // Copied from includes/SpecialSearch.php

    // Number of results
    $num = 0;

    $search = SearchEngine::create();
    $search->setLimitOffset( $specialSearchObj->limit, $specialSearchObj->offset );
    $search->setNamespaces( $specialSearchObj->namespaces );
    $search->showRedirects = $specialSearchObj->searchRedirects;
    $search->strictMatching = true;
    // This searches titles and texts for the exact term, like the regular SpecialSearch.
    $titleMatches = $search->searchTitle( $term );
    $num += $titleMatches->numRows();
    $textMatches = $search->searchText( $term );
    $num += $textMatches->numRows();

    // Now search for a list of words, without strict matching.
    // $othersTerm is an array.
    $search->strictMatching = false;
    // getOtherSearches is YOUR method, which returns the list of words,
    // e.g. $term = "alpha" returns array( "alpha", "α" )
    $othersTerm = getOtherSearches($term);
    $othersMatches = $search->searchText(implode(' ', $othersTerm));
    $num += $othersMatches->numRows();

    // Now display the results in the same way as the regular SpecialSearch.
    if ( $num >= $specialSearchObj->limit ) {
        $top = wfShowingResults( $specialSearchObj->offset, $specialSearchObj->limit );
    } else {
        $top = wfShowingResultsNum( $specialSearchObj->offset, $specialSearchObj->limit, $num );
    }
    $wgOut->addHTML( "<p>{$top}</p>\n" );

    if( $num || $specialSearchObj->offset ) {
        $prevnext = wfViewPrevNext(
            $specialSearchObj->offset, $specialSearchObj->limit,
            'Special:Search',
            wfArrayToCGI(
                $specialSearchObj->powerSearchOptions(),
                array( 'search' => $term ) ) );
        $wgOut->addHTML( "<br />{$prevnext}\n" );
    }

    // I don't remember what convertForSearchResult does, but I apply it to all
    // the similar words plus the searched $term.
    global $wgContLang;
    //$tm = $wgContLang->convertForSearchResult( $search->termMatches() );
    $othersTerm[] = $term;
    $tm = $wgContLang->convertForSearchResult( $othersTerm );
    $terms = implode( '|', $tm );

    // Display matches in titles (from SpecialSearch)
    if( $titleMatches ) {
        if( $titleMatches->numRows() ) {
            $wgOut->addWikiText( '==' . wfMsg( 'titlematches' ) . "==\n" );
            $wgOut->addHTML( $specialSearchObj->showMatches( $titleMatches, $terms ) );
        } else {
            $wgOut->addWikiText( '==' . wfMsg( 'notitlematches' ) . "==\n" );
        }
    }

    // Display matches in text (from SpecialSearch)
    if( $textMatches ) {
        if( $textMatches->numRows() ) {
            $wgOut->addWikiText( '==' . wfMsg( 'textmatches' ) . "==\n" );
            $wgOut->addHTML( $specialSearchObj->showMatches( $textMatches, $terms ) );
        } elseif( $num == 0 ) {
            # Don't show the 'no text matches' if we received title matches
            $wgOut->addWikiText( '==' . wfMsg( 'notextmatches' ) . "==\n" );
        }
    }

    // Display matches for similar words
    if ($othersMatches->numRows() || true/*for debugging*/) {
        $wgOut->addWikiText('=='.wfMsg('othermatches')."==\n" );
        $wgOut->addHTML('<!-- Debug ('.count($othersTerm).') : ' . implode(', ', $othersTerm).' -->');
        $wgOut->addHTML($specialSearchObj->showMatches($othersMatches, $terms));
    }

    // Conclusion
    if ( $num == 0 ) {
        $wgOut->addWikiText( wfMsg( 'nonefound' ) );
    }
    if( $num || $specialSearchObj->offset ) {
        $wgOut->addHTML( "<p>{$prevnext}</p>\n" );
    }

    // Returning false tells wfRunHooks not to execute what is inside the
    // if (wfRunHooks(...)) block, so the regular search in SpecialSearch is not
    // executed. I copied it here, so it is already done. It's an "override" hook.
    return false;
}
I hope the formatting will not be destroyed in email :-)
Hope this helps, Sylvain
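The code above leaves getOtherSearches() to the reader. One possible shape for it, as a minimal sketch only: the groups of equivalent spellings below are hypothetical examples drawn from this thread, the lookup is an exact, case-sensitive match, and this is not the implementation running on Sylvain's site.

```php
<?php
// Hypothetical implementation of the getOtherSearches() helper that the
// extension expects: given a search term, return every spelling that
// should be treated as equivalent to it.
function getOtherSearches($term) {
    // Illustrative groups only; each inner array lists equivalent spellings.
    $groups = array(
        array('alpha', 'α'),
        array('beta', 'β'),
        array('iovano', 'jovano', 'ïovano', 'Йовано'),
    );
    foreach ($groups as $group) {
        // Strict comparison: the term must match one spelling exactly.
        if (in_array($term, $group, true)) {
            return $group;
        }
    }
    // Unknown term: the only "similar word" is the term itself.
    return array($term);
}
```

For a wiki that only needs Greek letters, the table could stay this small; handling diacritics and Cyrillic in general would need a much larger mapping or a rule-based approach.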