On Jan 25, 2008 6:11 PM, Brion Vibber brion@wikimedia.org wrote:
Thomas Dalton wrote:
This has come up before - if memory serves, the excuse given was that IE doesn't always show custom error pages (I think there is a minimum size, although I can't see how the MediaWiki pages would be shorter than that). It's something that bothers me as well - a page saying "page not found" really should return a 404 error code...
*nod*
See http://bugzilla.wikimedia.org/show_bug.cgi?id=2585
There were difficult-to-track-down errors at the time we originally tried this, but we may have done it wrong (e.g. without ensuring the minimum object size for IE...) or else there may have been some sort of proxy issue that we didn't track down correctly.
For what it's worth, LiveJournal has used the "huge HTML comment" hack for error pages for a long time without (as far as I know) problems. Examine e.g. http://news.livejournal.com/friends/nonesuch -- hopefully you see the server-provided error text on all browsers. (But LiveJournal's user reach is much smaller than Wikipedia's, and it's not as important that this text is displayed as it is for Wikipedia.)
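To make the "huge HTML comment" hack concrete, here is a rough sketch of the idea in Python. This is not LiveJournal's or MediaWiki's actual code; the function name and the 512-byte threshold are my assumptions (512 bytes is the figure commonly cited for IE's "friendly" 404 page, but I haven't verified it against every IE version).

```python
# Assumed threshold: the commonly cited size below which IE replaces a
# server's 404 body with its own "friendly" error page.
IE_MIN_BYTES = 512


def pad_error_body(html: str, min_bytes: int = IE_MIN_BYTES) -> str:
    """Append an HTML comment so the UTF-8 encoded body exceeds min_bytes.

    Hypothetical helper for illustration only.
    """
    size = len(html.encode("utf-8"))
    if size > min_bytes:
        return html  # already big enough, leave it alone
    overhead = len("<!--  -->")  # comment delimiters plus two spaces
    filler = "x" * max(min_bytes - size - overhead + 1, 1)
    return f"{html}<!-- {filler} -->"


if __name__ == "__main__":
    body = "<html><body><h1>Page not found</h1></body></html>"
    padded = pad_error_body(body)
    print(len(body.encode("utf-8")), len(padded.encode("utf-8")))
```

The padded body would then be sent with a real 404 status, so browsers still show the server's text while crawlers get an honest status code.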
Personally, I'd love it if Wikipedia had more informative error codes. My understanding is that Googlebot has to use all sorts of heuristics to guess at whether a given page is "actually" a 404 or not, and heuristic guessing is what got us into the above mess in the first place.
You might find it amusing to use Code Search to search for the string "Wikipedia does not have an article with this exact name" to see others who've been confronted with this problem: http://www.google.com/codesearch?hl=en&lr=&q=%22Wikipedia+does+not+h... (Of course, none of those solutions work for non-English wikis...)
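Roughly, the workaround those clients end up writing looks like the sketch below. The function name is made up for illustration, and the marker is just the English string quoted above; the point is that the status check is the reliable part and the string match is the fragile fallback that only exists because the page comes back as a 200.

```python
# English-only marker text, taken from the missing-article page quoted above.
MARKER = "Wikipedia does not have an article with this exact name"


def page_is_missing(status: int, body: str) -> bool:
    """Guess whether a fetched page is a missing article.

    Hypothetical helper: trusts a real 404 first, then falls back to
    scraping the body for the English marker text.
    """
    if status == 404:
        return True  # an honest status code needs no guessing
    # "Soft 404": the server said 200 but the body says the page is missing.
    # This breaks on non-English wikis or if the wording ever changes.
    return status == 200 and MARKER in body


if __name__ == "__main__":
    print(page_is_missing(404, ""))
    print(page_is_missing(200, f"<p>{MARKER}.</p>"))
    print(page_is_missing(200, "<p>An actual article.</p>"))
```

If the server returned a proper 404, the second branch (and every copy of it scattered around other people's code) could simply go away.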
Another, more complicated instance of this is observable at http://www.google.com/search?q=nitty+gritty , where result #6 is: "Nitty gritty - Wikipedia, the free encyclopedia Wikipedia does not currently have an encyclopedia article for Nitty gritty. You may want to search Wiktionary for "Nitty gritty" instead. ..." If that were instead a redirect, Google could pick up the useful Wiktionary article rather than the useless Wikipedia page.