SearchEngine subclasses can implement getTextFromContent() if they want to override the normal text fetching behavior.
I can't put it into a SearchEngine subclass because Tika isn't a search engine; it's a separate Java application that extracts text from binary files such as *.doc and *.pdf.
TikaMW is a plugin that should work with any search engine: it only modifies the indexed text for pages in the File: namespace.