[Mediawiki-l] Indexing Wikipedia with Solr/Lucene

Platonides Platonides at gmail.com
Sun May 13 19:06:32 UTC 2012


On 13/05/12 20:53, vineet yadav wrote:
> Hi all,
> I want to create a Lucene/Solr index of the Wikipedia XML dump. I used the
> Solr example (http://wiki.apache.org/solr/DataImportHandler#Example:_Indexing_wikipedia)
> to index the dump. Since categories and external links are part of the
> Wikipedia page text, I am not able to index them separately. I want to
> index categories, external links, etc. separately and store them in
> separate fields.
> Would anyone please be kind enough to give me a bit of advice?
> Thanks
> Vineet Yadav

Does it help that they are also available in separate tables
(categorylinks and externallinks, which are published as SQL dumps
alongside the XML dump)?
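
If working from the page text is still preferred over the tables, one
possible approach is to pull the categories and external links out of the
wikitext before sending each document to Solr. The following is only a
rough sketch in Python, not part of the Solr example: the function name
extract_fields and the field names "category" and "external_link" are
made up for illustration, and the regular expressions cover only the
common [[Category:...]] and [http://... label] forms. Categories or links
added through templates, and bare URLs without brackets, will be missed,
which is one reason the tables may be the more reliable source.

```python
import re

# Matches [[Category:Name]] and [[Category:Name|sort key]];
# only the category name is captured, the sort key is dropped.
CATEGORY_RE = re.compile(
    r"\[\[\s*Category\s*:\s*([^\]|]+?)\s*(?:\|[^\]]*)?\]\]",
    re.IGNORECASE,
)

# Matches bracketed external links such as [http://example.org label]
# and captures the URL. The lookbehind skips [[internal links]].
BRACKETED_URL_RE = re.compile(r"(?<!\[)\[((?:https?|ftp)://[^\s\]]+)")


def extract_fields(wikitext):
    """Return categories and external links found in one page's wikitext.

    The dictionary keys are hypothetical Solr field names; they would
    need to match multi-valued fields declared in your own schema.
    """
    categories = [m.group(1).strip() for m in CATEGORY_RE.finditer(wikitext)]
    external_links = BRACKETED_URL_RE.findall(wikitext)
    return {"category": categories, "external_link": external_links}


if __name__ == "__main__":
    sample = (
        "Lucene is a search library. See [http://example.org Example].\n"
        "[[Category:Search engines]]\n"
        "[[Category:Free software|Lucene]]\n"
    )
    print(extract_fields(sample))
```

The returned lists could then be added to each document as separate
multi-valued fields alongside the page text.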



