Hello,
I implemented a fulltext search in wiki by using the Lucene classes to index and search the wiki SQL Tables. At the moment I simply use the following query to select the articles I want to search in:
"SELECT * FROM cur WHERE cur_namespace=0"
To prevent finding already deleted articles I compare the results with the results I get using the following query:
"SELECT * FROM archive WHERE ar_title='searchterm'"
where "searchterm" is replaced in turn by the title of each result from the first query. I use a Java application to discard all results that appear in both tables.
But for some reason a few articles are not contained in the cur table, although they appear as articles in the wiki. These articles can be found in a third table called "searchindex". I cannot use this table for searching because it's not possible to filter it by namespace the way the cur table can be.
To get to the point: my problem is that I cannot figure out a query that selects not only all articles with namespace 0 in the cur table, but also those that appear only in the searchindex table. I already tried the query recommended by the MediaWiki database documentation for the searchindex table, but it can't find these few articles either.
Greetings, Marcel
Homer Simpson wrote:
I implemented a fulltext search in wiki by using the Lucene classes to index and search the wiki SQL Tables. At the moment I simply use the following query to select the articles I want to search in:
"SELECT * FROM cur WHERE cur_namespace=0"
There is no "cur" table in current versions of MediaWiki. It's obsolete, and left over from an upgrade from an old version.
To prevent finding already deleted articles I compare the results with the results I get using the following query:
"SELECT * FROM archive WHERE ar_title='searchterm'"
This is a poor practice, and also your query is wrong.
Check the page table, and don't forget to use page_namespace.
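For illustration only, here is a minimal JDBC sketch of what selecting from the page table could look like. It is not taken from this thread. It assumes the page/revision/text layout introduced with MediaWiki 1.5 (page_latest pointing to rev_id, rev_text_id pointing to old_id), no table prefix, and uncompressed UTF-8 text; later MediaWiki versions store revision text differently, so check the schema of your own installation. The class and method names are hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: reads current article text from the page table
// instead of the obsolete cur table.
final class LivePages {

    // Assumed MediaWiki 1.5-era schema: page -> revision -> text.
    // Deleted pages have no row in page, so no check against archive is needed.
    static final String LIVE_PAGES_SQL =
        "SELECT page_title, old_text "
      + "FROM page "
      + "JOIN revision ON rev_id = page_latest "
      + "JOIN `text` ON old_id = rev_text_id "
      + "WHERE page_namespace = ?";

    // Returns title -> wikitext for every existing page in the given namespace.
    static Map<String, String> fetch(Connection conn, int namespace) throws SQLException {
        Map<String, String> pages = new LinkedHashMap<>();
        try (PreparedStatement stmt = conn.prepareStatement(LIVE_PAGES_SQL)) {
            stmt.setInt(1, namespace); // 0 is the main article namespace
            try (ResultSet rs = stmt.executeQuery()) {
                while (rs.next()) {
                    // old_text is a blob column; this assumes it holds plain UTF-8
                    // (rows flagged as compressed or external would need extra handling).
                    String text = new String(rs.getBytes("old_text"), StandardCharsets.UTF_8);
                    pages.put(rs.getString("page_title"), text);
                }
            }
        }
        return pages;
    }
}
```

Calling LivePages.fetch(conn, 0) on an open connection would then return the main-namespace articles, which could be handed to the indexer in place of the cur/archive comparison.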
-- brion vibber (brion @ pobox.com)
Homer Simpson wrote:
I implemented a fulltext search in wiki by using the Lucene classes to index and search the wiki SQL Tables. At the moment I simply use the following query to select the articles I want to search in:
"SELECT * FROM cur WHERE cur_namespace=0"
There is no "cur" table in current versions of MediaWiki. It's obsolete, and left over from an upgrade from an old version.
To prevent finding already deleted articles I compare the results with the results I get using the following query:
"SELECT * FROM archive WHERE ar_title='searchterm'"
This is a poor practice, and also your query is wrong.
Check the page table, and don't forget to use page_namespace.
Thanks, I didn't think about leftover tables from older versions. I'm at home, so I can't check the tables right now. I don't remember any page_namespace field, but I'll check again.
Thanks again for your fast answer, Marcel
wikitech-l@lists.wikimedia.org