Julien Lemoine wrote:
Steve Bennett wrote:
Just one more request: what about if the search string matches *within* the phrase, but not necessarily at the start. Like "Hardy" matching "Laurel and Hardy"? Is that possible with this data structure?
No, it is not. But I don't think this would be really useful: imagine a query with a single letter ("a" or "e", for example). It would match at least half of the articles, and the user would not understand the results.
Best Regards. Julien Lemoine
I wrote a "suggest" function for finding part numbers for a request form at work - enabling multi-word search is really useful and powerful. I haven't looked at your code, so I don't know if you're using a keyword index, or LIKE SQL functions. I set mine up to require at least 4 characters and to break on whitespace, so the SQL (pseudocode) is WHERE LIKE(wordone) AND LIKE(wordtwo) and so on.
Best Regards, Aerik
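[Editor's sketch: the approach Aerik describes could look roughly like the following. This is not his code. The `parts` table, its `part_number` and `description` columns, and the `suggest` function are hypothetical names chosen for illustration, SQLite stands in for whatever database he used, and the 4-character minimum is assumed to apply to the whole query rather than to each word.]

```python
import sqlite3

MIN_CHARS = 4  # minimum query length from the post (assumed to apply to the whole query)


def _like_pattern(word):
    # Escape LIKE wildcards so user input is matched literally, then wrap in %...%.
    escaped = word.replace("\\", "\\\\").replace("%", "\\%").replace("_", "\\_")
    return f"%{escaped}%"


def suggest(conn, query, limit=10):
    """Return part numbers whose description contains every word of `query`."""
    query = query.strip()
    if len(query) < MIN_CHARS:
        return []
    words = query.split()  # break on whitespace
    # One "description LIKE ?" clause per word, joined with AND.
    where = " AND ".join("description LIKE ? ESCAPE '\\'" for _ in words)
    sql = f"SELECT part_number FROM parts WHERE {where} ORDER BY part_number LIMIT ?"
    params = [_like_pattern(w) for w in words] + [limit]
    return [row[0] for row in conn.execute(sql, params)]
```

[Each `%word%` pattern generally forces a scan of every row, because an ordinary index cannot be used for a pattern with a leading wildcard. That is comfortable for a modest parts catalogue and is where the cost comes from on larger data sets.]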
Aerik Sylvan wrote:
I wrote a "suggest" function for finding part numbers for a request form at work - enabling multi-word search is really useful and powerful. I haven't looked at your code, so I don't know if you're using a keyword index, or LIKE SQL functions. I set mine up to require at least 4 characters and to break on whitespace, so the SQL (pseudocode) is WHERE LIKE(wordone) AND LIKE(wordtwo) and so on.
This is, of course, the most obvious and most naive solution. For a data set as large as Wikipedia's list of article titles, this is far too slow and inefficient and would kill the site for everyone if it was used on the live database.
Timwi
On 8/4/06, Timwi <timwi@gmx.net> wrote:
This is, of course, the most obvious and most naive solution. For a data set as large as Wikipedia's list of article titles, this is far too slow and inefficient and would kill the site for everyone if it was used on the live database.
Thus far, I don't think anyone has actually suggested using the suggest mechanism on the "live database".
Steve
Steve Bennett wrote:
On 8/4/06, Timwi <timwi@gmx.net> wrote:
This is, of course, the most obvious and most naive solution. For a data set as large as Wikipedia's list of article titles, this is far too slow and inefficient and would kill the site for everyone if it was used on the live database.
Thus far, I don't think anyone has actually suggested using the Suggest mechanism on the "live database".
I think Timwi was not talking about updating the suggest data live, but about putting the service on the same host/network. So the suggest service needs to be very fast and use very little CPU.
Best Regards. Julien Lemoine
Hello Aerik,
Aerik Sylvan wrote:
I wrote a "suggest" function for finding part numbers for a request form at work - enabling multi-word search is really useful and powerful. I haven't looked at your code, so I don't know if you're using a keyword index, or LIKE SQL functions. I set mine up to require at least 4 characters and to break on whitespace, so the SQL (pseudocode) is WHERE LIKE(wordone) AND LIKE(wordtwo) and so on.
With this solution, you will not be able to handle many queries per second (it will probably take more than 1 second to query 2.3 million entries). If you use a trie (yes, I know this word now :)), everything is precomputed, so a query uses very little CPU and you will be able to handle a lot of queries per second.
Best Regards. Julien Lemoine
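[Editor's sketch: a minimal illustration of the precomputed-trie idea, not Julien's actual implementation. It assumes each title is inserted once with a hypothetical numeric `score` (for example a popularity count) and that every node stores its best `k` completions at build time. The class and method names are invented for this example.]

```python
class TrieNode:
    __slots__ = ("children", "top")

    def __init__(self):
        self.children = {}  # next character -> TrieNode
        self.top = []       # precomputed (score, title) pairs, best first


class SuggestTrie:
    def __init__(self, k=5):
        self.root = TrieNode()
        self.k = k

    def insert(self, title, score):
        """Add a title and update the top-k list of every node on its path."""
        node = self.root
        for ch in title.lower():
            node = node.children.setdefault(ch, TrieNode())
            node.top.append((score, title))
            # Highest score first, ties broken alphabetically; keep only k entries.
            node.top.sort(key=lambda pair: (-pair[0], pair[1]))
            del node.top[self.k:]

    def suggest(self, prefix):
        """Walk the prefix and return the stored completions; no scan at query time."""
        node = self.root
        for ch in prefix.lower():
            node = node.children.get(ch)
            if node is None:
                return []
        return [title for _, title in node.top]
```

[A query costs one dictionary lookup per typed character, independent of the number of titles. In this sketch an empty prefix returns nothing, and "hardy" will not find "Laurel and Hardy", since only the start of each title is indexed; that is the limitation discussed earlier in the thread.]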
wikitech-l@lists.wikimedia.org