On Tue, Nov 24, 2015 at 11:26 PM, Leila Zia <leila@wikimedia.org> wrote:
> It's worth mentioning:
> Dominant search engines do not rely on a single source of information to surface results. They draw on many sources, weigh the responses according to how much they trust each source, along with many other factors, and aggregate them to find the best answer to show the user.
Have you never seen Google display gross Wikipedia vandalism?[1][2] Cases like that make it very clear that the Wikimedia content in question entered Google directly, without human oversight or cross-checking against other sources. What you describe sounds good, but it didn't happen.
If even transient vandalism passes through (the Finnish vandalism was reportedly deleted in Wikipedia within minutes), then so can more subtle and long-lived errors and falsehoods.
Similarly, Bing Satori's timeline is simply made up of verbatim Wikipedia sentences containing a numerical year.
We know far too little about how search engines import Wikipedia and Wikidata content, and what proportion of content is checked and how.
> I just used "chicken pox" as a search query in Google. I see an information box on the right-hand side of the page about the disease, and when I click on Sources I get this page:
> https://support.google.com/websearch/answer/2364942?p=medical_conditions&...
> ("See where we found the medical information"), which lists all the sources Google used to retrieve information about chicken pox. Nothing in that list starts with "wiki". Of course, this is not the case for all search queries; for some of them, Google still uses Wikipedia snippets.
For medical queries, Google (rightly) prefers other sources, so those queries are not presently affected.
[1] https://www.seroundtable.com/google-world-series-cardinals-blunder-17587.htm...
[2] https://commons.wikimedia.org/wiki/File:Wikipedia_vandalism_in_Google_infobo...