On Tue, Nov 24, 2015 at 11:26 PM, Leila Zia <leila(a)wikimedia.org> wrote:
It's worth mentioning:
Dominant search engines do not rely on one source of information to surface
results; they get information from many sources, weigh the responses they
get based on their trust in the sources and many other factors, and aggregate
them to find the best answer to show to the user.
Have you never seen Google display gross Wikipedia vandalism?[1][2] Cases
like that make it very clear that the Wikimedia content in question entered
Google directly, without human oversight or cross-checking against other
sources. What you describe sounds good, but it didn't happen.
If even transient vandalism passes through (the Finnish vandalism was
reportedly deleted in Wikipedia within minutes), then so can more subtle
and long-lived errors and falsehoods.
Similarly, Bing Satori's timeline is simply made up of verbatim Wikipedia
sentences containing a numerical year.
We know far too little about how search engines import Wikipedia and
Wikidata content, and what proportion of content is checked and how.
I just used "chicken pox" as a search query in Google. I see an information
box on the right-hand side of the page about the disease, and when I click
on Sources I get this page
<
https://support.google.com/websearch/answer/2364942?p=medical_conditions&am…
("See where we found the medical information"), which shows all the sources
Google has used to retrieve information about chicken pox; nothing in
that list starts with wiki. Of course, this is not the case for all search
queries; for some of them, Google still uses Wikipedia snippets.
For medical queries, Google (rightly) prefers other sources, so those
queries are not presently affected.
[1] https://www.seroundtable.com/google-world-series-cardinals-blunder-17587.ht…
[2] https://commons.wikimedia.org/wiki/File:Wikipedia_vandalism_in_Google_infob…