Thomas Dalton wrote:
On 04/01/2008, Jack Eapen C wrote:
By this time I have figured out something. Just playing around with MediaWiki and MySQL fulltext search, I have the following function:
$wgHooks['ArticleSaveComplete'][] = 'getNormalTextfromWikiText';
function getNormalTextfromWikiText( &$article, &$user, &$text ) {
    global $wgParser;
    $result = $wgParser->parse( $text, $wgParser->mTitle, $wgParser->mOptions );
    $new_text = $result->getText();

    $dbw =& wfGetDB( DB_MASTER );
    $dbw->insert( 'searchable_text', array(
        'page_id' => $article->getID(),
        'searchable_text' => $new_text
    ) );
    return true;
}
You probably want to move that to the place where the page is rendered. Doing it in the save hook has two drawbacks: a) You're parsing the page twice (and parsing is expensive). b) Your table is not updated when the page changes without being edited (e.g. when a template it uses changes).
Ah! I see. I was thinking about HTML tables and was completely confused! Would a simple regexp that removes everything between < and > do the trick? I've never really got the hang of regexps, so I won't give you any code to try, but it should be a relatively easy one, I imagine.
You shouldn't use a regex for HTML tags. Use the PHP function strip_tags().
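For what it's worth, here is a rough sketch of how strip_tags() could be used on the rendered HTML before it goes into the search table. The function name htmlToSearchableText and the sample string are made up for illustration, and the entity decoding and whitespace cleanup are my own assumptions about what a fulltext index would want, not something from the code above.

```php
<?php
// Hypothetical helper: turn rendered HTML into plain text for a search index.
function htmlToSearchableText( $html ) {
    // strip_tags() removes HTML and PHP tags; entities such as &amp; are left as-is.
    $plain = strip_tags( $html );

    // Assumption: the index should hold literal characters, so decode entities.
    $plain = html_entity_decode( $plain, ENT_QUOTES, 'UTF-8' );

    // Collapse the runs of whitespace left behind where tags used to be.
    return trim( preg_replace( '/\s+/', ' ', $plain ) );
}

// Made-up sample standing in for parser output.
$sample = '<p>Hello <b>world</b>, see <a href="/wiki/Foo">Foo</a> &amp; more.</p>';
echo htmlToSearchableText( $sample ), "\n";
```

In the hook above, this would presumably be applied to the result of $result->getText() before the insert. Note that strip_tags() drops the tags but keeps the text between them, so adjacent block elements can run together unless whitespace already separates them in the HTML.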