On 8/3/06, Julien Lemoine <speedblue@happycoders.org> wrote:
- How do you handle pages with the same title, but different
capitalization? They're rare, but they do occur. My suspicion from scanning Analyzer.cpp is that you just take the most popular. However, if it's for search, it would be best to include everything (I think).
Yes, you are right, only the most popular is kept. I wanted a case-insensitive search, and for the moment I have only one article per final node. Including everything would be better, so I have added it to my TODO list.
I didn't notice this - yes, a case-insensitive search is critical. Problems with casing are one of the most likely reasons a user will fail to find a given page. MediaWiki automatically tries the three most common casing options - all lower, ALL CAPS, Title Caps - but misses the likely "Caps of Important Words".
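
As a rough sketch of the idea (in Python, not the project's actual C++; `build_index`, `lookup` and the sample titles are hypothetical), a case-insensitive index could key on the case-folded title and keep every original title under that key, rather than only the most popular one:

```python
from collections import defaultdict


def build_index(titles):
    """Map each case-folded title to every original title that shares it."""
    index = defaultdict(list)
    for title in titles:
        # casefold() is a more aggressive lower() intended for caseless matching
        index[title.casefold()].append(title)
    return index


def lookup(index, query):
    """Return all titles matching the query, ignoring case."""
    # .get() avoids creating an empty entry for queries that match nothing
    return index.get(query.casefold(), [])


# Hypothetical sample data: two pages differing only in capitalization
titles = ["Top roping", "Top Roping", "Climbing"]
index = build_index(titles)
print(lookup(index, "TOP ROPING"))
```

With this layout any casing of the query, including "Caps of Important Words", reaches the same key, and pages that differ only in capitalization are all returned.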
I will generate a new index with redirects and give you the size before and after this evening.
Redirects should be indicated with what they redirect to, like:
...
Top-rope climbing -> top roping
Top roping
...
Or something - but maybe with the proviso that it doesn't show a redirect if the actual target is currently being displayed.
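
One possible reading of that proviso, sketched in Python (the function name `format_results` and the `redirects` mapping are assumptions for illustration, not anything in the existing code): annotate each redirect with its target, and drop the redirect line when the target itself is already among the displayed results.

```python
def format_results(matches, redirects):
    """Render result lines, annotating redirects with their target.

    matches:   titles to display, in display order.
    redirects: dict mapping a redirect title to its target title
               (hypothetical structure, assumed for this sketch).
    """
    # Real (non-redirect) pages that will be displayed, compared caselessly
    shown = {t.casefold() for t in matches if t not in redirects}
    lines = []
    for title in matches:
        target = redirects.get(title)
        if target is None:
            lines.append(title)  # ordinary page
        elif target.casefold() not in shown:
            lines.append(f"{title} -> {target}")  # redirect, target not displayed
        # otherwise the target is already displayed, so the redirect is skipped
    return lines


redirects = {"Top-rope climbing": "Top roping"}
print(format_results(["Top-rope climbing"], redirects))
print(format_results(["Top-rope climbing", "Top roping"], redirects))
```

The first call lists the redirect with its target; the second lists only the target page, since showing both would be redundant under this interpretation.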
Steve