Hello,
I implemented a full-text search for our wiki, using the Lucene classes to index and search the wiki's SQL tables. At the moment I simply use the following query to select the articles I want to search:
"SELECT * FROM cur WHERE cur_namespace=0"
To avoid returning articles that have already been deleted, I compare the results with those of the following query:
"SELECT * FROM archive WHERE ar_title='searchterm'"
where "searchterm" is replaced in turn by each result of the first query. I use a Java application to discard all results that appear in both tables.
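To make the filtering step concrete, here is a minimal Java sketch of it. It assumes the titles from cur and from archive have already been read into collections; the class and method names are hypothetical and are not part of MediaWiki or Lucene. Collecting the archive titles once (for example with a query on ar_title restricted to ar_namespace=0, which is an assumption about how one might fetch them) would avoid running one archive query per result.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper for illustration; not part of MediaWiki or Lucene.
class DeletedArticleFilter {

    // Keeps only the titles from cur that do not also appear in archive.
    static List<String> removeArchived(List<String> curTitles, Set<String> archivedTitles) {
        List<String> kept = new ArrayList<>();
        for (String title : curTitles) {
            if (!archivedTitles.contains(title)) {
                kept.add(title);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        // Made-up sample titles standing in for the two query results.
        List<String> cur = Arrays.asList("Main_Page", "Lucene", "Old_Draft");
        Set<String> archive = new HashSet<>(Arrays.asList("Old_Draft"));
        System.out.println(removeArchived(cur, archive)); // prints [Main_Page, Lucene]
    }
}
```

Using a set for the archive titles keeps each lookup cheap, so the comparison stays fast even for a large wiki.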
But for some reason a few articles are missing from the cur table, although they appear as articles in the wiki. These articles can be found in a third table called "searchindex". I cannot use this table for searching because it's not possible to filter it by namespace the way I can filter the cur table.
To get to the point: my problem is that I cannot figure out a query that selects not only all articles with namespace 0 in the cur table, but also those that appear only in the searchindex table. I already tried the query recommended by the MediaWiki database documentation for the searchindex table, but that query can't find these few articles either.
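The selection I am after could be expressed in application code roughly as follows. This is only a sketch under assumptions, not a tested answer: it assumes that si_page in searchindex refers to the same ID as cur_id in cur, and that the three ID sets have already been loaded by separate queries. The class and method names are hypothetical.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper for illustration; not part of MediaWiki or Lucene.
class ArticleSelection {

    // Returns the IDs of all namespace-0 rows in cur, plus every ID that
    // occurs in searchindex but in no row of cur at all.
    // Assumption: si_page and cur_id identify the same article.
    static Set<Integer> selectArticleIds(Set<Integer> curNamespaceZeroIds,
                                         Set<Integer> allCurIds,
                                         Set<Integer> searchIndexIds) {
        Set<Integer> result = new TreeSet<>(curNamespaceZeroIds);
        for (Integer id : searchIndexIds) {
            if (!allCurIds.contains(id)) {
                result.add(id); // only present in searchindex
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Made-up IDs: 3 is in cur under another namespace, 7 is missing from cur.
        Set<Integer> nsZero = new HashSet<>(Arrays.asList(1, 2));
        Set<Integer> allCur = new HashSet<>(Arrays.asList(1, 2, 3));
        Set<Integer> index = new HashSet<>(Arrays.asList(1, 2, 3, 7));
        System.out.println(selectArticleIds(nsZero, allCur, index)); // prints [1, 2, 7]
    }
}
```

The namespace of a row that exists only in searchindex is unknown, so this sketch includes such rows regardless of namespace. What I would prefer is a single SQL query that achieves the same result.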
Greetings, Marcel
mediawiki-l@lists.wikimedia.org