Trey Jones, 24/07/2015 23:16:
I've partially analyzed a sample of 100K zero-result full_text searches against enwiki, over the course of about an hour
Great! Can't wait for the same across all languages.
*TL;DR Summary: If these patterns hold for another sample (and across languages), we should be able to get some decent mileage out of these simple approaches:*
*- find sources of weird patterns and either ignore them, or contact the source and redirect them to a more appropriate destination*
*- use language or character set detection to redirect queries to other wikis*
Yes, that's the most important one. Crosswiki searches are possible with Cirrus: https://phabricator.wikimedia.org/T46420
T26767 may go a long way, and T3837 may be fixed by following the same approach as for sister projects (boxes on Special:Search for crosswiki results).
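To make the character set idea concrete, here is a minimal sketch that guesses the dominant script of a query from Unicode character names. The names `dominant_script`, `suggest_wiki` and the `SCRIPT_TO_WIKI` table are hypothetical, not anything that exists in Cirrus, and a script alone does not identify a language (Cyrillic is not only Russian), so the mapping is for illustration only.

```python
import unicodedata
from collections import Counter

# Hypothetical script-to-wiki mapping, for illustration only.
# A script does not determine a language; real routing would need more.
SCRIPT_TO_WIKI = {
    "CYRILLIC": "ruwiki",
    "ARABIC": "arwiki",
    "HEBREW": "hewiki",
    "HANGUL": "kowiki",
}


def dominant_script(query):
    """Return the most common script among the letters of query, or None."""
    counts = Counter()
    for ch in query:
        if not ch.isalpha():
            continue  # skip digits, punctuation and whitespace
        name = unicodedata.name(ch, "")
        if name:
            # Unicode names start with the script, e.g. "CYRILLIC SMALL LETTER PE".
            counts[name.split()[0]] += 1
    if not counts:
        return None
    return counts.most_common(1)[0][0]


def suggest_wiki(query, default="enwiki"):
    """Suggest a wiki for the query, falling back to the default wiki."""
    return SCRIPT_TO_WIKI.get(dominant_script(query), default)
```

For example, `suggest_wiki("Москва")` would pick the Cyrillic entry, while a Latin-script query stays on the default wiki.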
*- filter the term "quot" from queries*
quot or " ? Som
*- filter 14###########: from the front of queries*
UNIX time? 1437794758
*- replace _ with space in queries*
Do you have the actual URL of the search? [[Special:Search/this_format]] is supposed to ignore underscores already.
Also, a full text search URL containing "search=" but not "fulltext=" is first of all a title search (the aim is to be redirected to the title if it exists).
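The three cleanups proposed above, plus the two checks I mention, could be sketched roughly as follows. Everything here is an assumption for illustration: the function names are made up, the prefix regex assumes "14" followed by either 8 digits (UNIX time in seconds) or 11 digits (UNIX time in milliseconds) and a colon, and `is_title_first` only encodes the "search=" without "fulltext=" rule as stated above, not the actual MediaWiki logic.

```python
import re
from datetime import datetime, timezone
from urllib.parse import urlparse, parse_qs

# Assumed prefix shape: "14" + 8 digits (seconds) or 11 digits (milliseconds) + ":".
TIMESTAMP_PREFIX = re.compile(r"^14(?:\d{11}|\d{8}):\s*")
# The bare word "quot", possibly the remains of a mangled quotation mark entity.
QUOT_WORD = re.compile(r"\bquot\b", re.IGNORECASE)


def clean_query(query):
    """Apply the three proposed cleanups to a raw query string."""
    query = TIMESTAMP_PREFIX.sub("", query)
    # Underscores first: "_" counts as a word character, so "quot_x"
    # would otherwise hide the word boundary after "quot".
    query = query.replace("_", " ")
    query = QUOT_WORD.sub(" ", query)
    return " ".join(query.split())  # collapse leftover whitespace


def as_utc(seconds):
    """Interpret a number as UNIX time in seconds and return a UTC datetime."""
    return datetime.fromtimestamp(seconds, tz=timezone.utc)


def is_title_first(url):
    """True if the URL has "search=" but no "fulltext=" parameter."""
    params = parse_qs(urlparse(url).query, keep_blank_values=True)
    return "search" in params and "fulltext" not in params
```

With this, `as_utc(1437794758)` falls on 2015-07-25, which is why the 14... prefix looks like a timestamp to me. Whether stripping "quot" is safe depends on where it comes from, so this is only a starting point.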
*- lots of queries for books, articles, movies, tv, mp3s, and porn (in multiple languages)*
Titles and other proper names are perfect for crosswiki search!
For the incremental searches, do we know if all browsers which integrate a Wikipedia search bar use the correct API for suggestions?
Nemo