SearchEngine subclasses can implement
getTextFromContent() if they want
to override the normal text fetching behavior.
I can't put it into a SearchEngine subclass because Tika isn't a search
engine. Rather, it is a Java application that runs separately and
extracts text from binary files such as *.doc, *.pdf and so on.
TikaMW is a plugin that should work with any search engine - it just
modifies the indexed text for pages in the File: namespace.
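
To illustrate the idea, here is a rough sketch in Python rather than the
actual plugin code. The names on_search_update and extract_text_with_tika
are made up for this example, and the call to Tika is replaced by a stub.
Only the namespace number 6 for File: comes from MediaWiki itself.

```python
NS_FILE = 6  # MediaWiki's numeric id for the File: namespace


def extract_text_with_tika(title):
    """Hypothetical stand-in for asking the separately running Tika for text."""
    # A real plugin would hand the uploaded file to Tika here; this stub
    # only returns a marker string so the sketch stays self-contained.
    return "extracted text of " + title


def on_search_update(namespace, title, text, extractor=extract_text_with_tika):
    """Hypothetical hook: return the text that the search engine should index."""
    if namespace != NS_FILE:
        # Pages outside File: are indexed unchanged.
        return text
    extracted = extractor(title)
    if not extracted:
        # Nothing could be extracted, so keep the description page text.
        return text
    # Keep the description page text and append the extracted file text.
    return text + "\n" + extracted
```

Because the text is changed before it reaches the search backend, the
backend does not need to know anything about Tika, so in this sketch it
does not matter which SearchEngine subclass is in use.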
--
With best regards,
Vitaliy Filippov