Hi all,
If you do a Google search and look at the Wikipedia results, e.g. https://www.google.com.au/search?q=Malvinas+Argentinas+International+Airport... you will see that the results state:
"DEAR WIKIPEDIA READERS: You're probably busy, so we'll get right to it. This week we ask our readers to help us. To protect our independence from corporate"
Instead of the article information.
It doesn't sit right with me that fundraising is interfering with Google results, even more so because the text shown reads "to protect our independence from corporate...."
Is there some way that this can be prevented, short of not using Google?
Russavia
I think Erik emailed last week to this list that this was an unintended side effect they are trying to solve. So I guess patience is the answer here.
Lodewijk
_______________________________________________
Wikimedia-l mailing list, guidelines at: https://meta.wikimedia.org/wiki/Mailing_lists/Guidelines
Wikimedia-l@lists.wikimedia.org
Unsubscribe: https://lists.wikimedia.org/mailman/listinfo/wikimedia-l, mailto:wikimedia-l-request@lists.wikimedia.org?subject=unsubscribe
Russavia asked me to check this to confirm it wasn't just him or his regional Google setup. It is correct, and on looking into it further, it's hitting every single page on Wikipedia that Google has indexed.
If you search for "DEAR WIKIPEDIA READERS: You're probably busy, so we'll get right to it. This week we ask our readers to help us. This week we ask our readers to protect our site:en.wikipedia.org" we're both getting 6,100,000 results.
If you take some pages at random from that search result and then try to find them through a fairly typical, sensible search using the page title or keywords from the article, some results show the fundraising banner text, while others show a relevant text excerpt from the page.
I'll pass this on to the developers too, but hopefully this helps here too.
Nick
The devs have been aware since December 4, based on the date https://phabricator.wikimedia.org/T76743 was opened.
-- John Vandenberg
Wow, 8 million returns on Google. Er, Lila, someone, how about making a decision to pause using fundraising banners until this is fixed or at least we understand why it is happening? It looks like a major global embarrassment from where I'm sitting and it really does not matter whether this is Google's fault or the Foundation's.
I'm finding the text "DEAR WIKIPEDIA READERS: This week we ask our readers to help us. To protect our independence, we'll never run ads. We survive on donations averaging ..." completely replacing all sorts of content when searching for the simplest educational material.
Fae
On 7 December 2014 at 12:19, Fæ faewik@gmail.com wrote:
Wow, 8 million returns on Google. Er, Lila, someone, how about making a decision to pause using fundraising banners until this is fixed or at least we understand why it is happening?
See Erik's comment somewhere in this kilometre-long thread on the fundraising banners:
"This came to our attention this morning SF time, and we quickly deployed fixes on our end:
https://gerrit.wikimedia.org/r/#/c/177598/
https://gerrit.wikimedia.org/r/#/c/177611/
This should fix the issue, but Google will need to recrawl the affected pages. We've already reached out to our contacts there to see if this can be done more quickly."
Best, Patrik
Thanks John for the link.
I've made an edit to https://en.wikipedia.org/wiki/Ushuaia_%E2%80%93_Malvinas_Argentinas_Internat... as I've been told that Google will update the text in its search results when articles are created and edited. Is that correct? If so, how long do you think the "fundraising" text could keep appearing in Google results?
I can confirm that https://en.wikipedia.org/wiki/Berry_and_MacFarlane_Monument is displaying correctly in Google results.
Cheers,
Russavia
This is a great opportunity to encourage people to edit ...
How about a banner ...
"If everyone edited just one page today, the Google search result snippets would be fixed in 2 hours."
Hi all,
For the record, we've been able to confirm that our fixes, which were already deployed Thursday, immediately addressed the issue on our end. According to Google Webmaster Tools, Google had also picked up the updated robots.txt by December 4. GoogleBot, for better or for worse, nowadays executes JavaScript, which caused it to index the banner text, since the JS was not blacklisted prior to December 4. We've pinged our Google contacts about faster re-crawling of impacted pages and will follow up further on that front.
Erik
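For anyone wanting to see how a robots.txt rule keeps a crawler away from a script, here is a minimal sketch using Python's standard `urllib.robotparser`. The rule, the `/banner-loader/` path and the `example.org` URLs are all made up for illustration; the actual entries deployed on December 4 are not quoted in this thread.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules for illustration only; not the real Wikimedia robots.txt.
ROBOTS_TXT = """\
User-agent: *
Disallow: /banner-loader/
"""


def crawler_may_fetch(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt text lets user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Hypothetical URLs: one for a banner script, one for an ordinary article.
BANNER_JS = "https://example.org/banner-loader/banner.js"
ARTICLE = "https://example.org/wiki/Some_article"

print(crawler_may_fetch(ROBOTS_TXT, "Googlebot", BANNER_JS))  # False
print(crawler_may_fetch(ROBOTS_TXT, "Googlebot", ARTICLE))    # True
```

A crawler that honours the rule never downloads the script, so it cannot execute it and index whatever text the script would have injected, while the article itself stays crawlable.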
Thanks for the update Erik
I can confirm that my edit to https://en.wikipedia.org/wiki/Ushuaia_%E2%80%93_Malvinas_Argentinas_Internat... has now fixed the issue in Google search as it relates to that article, but the issue still remains on 8,600,000 articles (up from 8,540,000 articles yesterday).
Cheers
Russavia
site:wikipedia.org "Dear Wikipedia readers" produces 936,000 results for me. Please note that Google uses a distributed index, and depending where you are geographically, and where Google sends you based on server load, you will get inconsistent results from query to query. See this paper for a bit more detail on how these index inconsistencies manifest:
http://cseweb.ucsd.edu/~snoeren/papers/bobble-pam14.pdf
Pages we know to have been re-crawled don't exhibit the issue, so it should be only a matter of time for the index to catch up. Please also note that the text being in the index does not automatically mean that it will show up in a typical search. Any search for the phrase itself will highlight it in the snippet (extract) shown in the search result page as a match, while a typical search will not include the phrase and will much less frequently identify the text to be a good match for the user's search query, mitigating global user impact significantly. We'd still like to resolve this completely as quickly as possible, of course.
Erik
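As a rough illustration of the point about snippets, the toy sketch below picks, for each query, the indexed passage that shares the most words with it. This is only an assumption-laden model and not how Google actually chooses snippets; the function names and the second passage are invented for the example.

```python
import re


def tokenize(text: str) -> set:
    """Lowercase the text and return its set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def pick_snippet(query: str, passages: list) -> str:
    """Return the passage sharing the most words with the query.

    Toy model only: real search engines use far richer signals.
    On a tie, the earliest passage wins.
    """
    query_words = tokenize(query)
    return max(passages, key=lambda p: len(query_words & tokenize(p)))


# One indexed page containing both banner text and (made-up) article text.
PASSAGES = [
    "DEAR WIKIPEDIA READERS: This week we ask our readers to help us.",
    "The airport serves the city of Ushuaia in Argentina.",
]

print(pick_snippet("Dear Wikipedia readers", PASSAGES))
print(pick_snippet("Ushuaia airport Argentina", PASSAGES))
```

Under this simplified model, searching for the banner phrase itself surfaces the banner passage, while a query about the article's subject surfaces the article text even though the banner is still in the index.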
Thanks for the thorough updates Erik :-)