First, some context:
I was in Philadelphia for the Democratic National Convention earlier this week, where I had been invited to speak (in a small side event) about connectivity and global development. I spoke about our work in the languages of the developing world, and made a point of saying that bad laws in the developed world which might hurt our work can damage the development of the Internet in the rest of the world. I urged lawmakers not to think of Internet legal questions as just "Silicon Valley versus Hollywood", but to understand that they affect our volunteer community and many other ordinary people online.
Second, the story:
The main conference was held in the [[Wells Fargo Center (Philadelphia)]], an indoor arena where basketball and hockey teams play normally.
A journalist friend said to me that he "finally found something that Wikipedia doesn't have" and he was surprised. What was that, I said? "The history of Wells Fargo". What?!! Really?!! That seemed impossible to me. He said we have an article about Wells Fargo that seems to be mostly about the contemporary bank, and when you search for Wells Fargo history there's also an article about the Wells Fargo History Museum.
I popped on my phone and used my own personal preferred method of finding things in Wikipedia: Google. I typed in "Wells Fargo history" and sure enough, the first two links are history pages from their official websites and the third link is Wikipedia - a normal state of affairs. He started to apologize for raising a false alarm.
I asked him for more details on exactly how he searched, and explained that I regard it to be very sad if some volunteers spend hundreds of hours working on an article, painstakingly going over tons of details in an effort to get it right, and then someone couldn't find it.
Here's what he did - and I replicated the steps and all was clear.
Go to http://www.wikipedia.org/
Make sure the dropdown in the search box is set to 'EN' - which it would have been for him.
Start typing 'Wells Fargo history' and watch as the dropdown selections narrow. You'll have the experience that he had - you'll see the bank article prominently featured and then various buildings (they have a habit of sponsoring sports arenas in various US cities) and finally as you start typing history it focuses in on the History Museum.
If you don't choose any of those, then hit enter, you'll get to the search results page. This is the one with a huge box of options at the top (which will be confusing and frightening to people who aren't already wikipedians) and then by my count the desired article is 13th on the page: [[History of Wells Fargo]].
Now, I strongly suspect this could be fixed by making a redirect from [[Wells Fargo history]] to [[History of Wells Fargo]].
Or a more serious fix could be had if the search engine understood that very often in English [[X of Y]] can be written [[Y X]]. ([[List of French monarchs]] becomes [[French monarchs list]]; see https://en.wikipedia.org/wiki/Special:Search?search=french+monarchs+list where the desired article is in 10th place.)
But my point is not to argue for any specific fix. My point is to illustrate that there is a real problem with search, that it is impacting users, and that we should invest in fixing it.
--Jimbo
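The [[Y X]] to [[X of Y]] idea in the message above can be sketched in a few lines. This is only an illustration under stated assumptions, not how Wikipedia's search works: the function names `candidate_titles` and `find_title` are hypothetical, the title set is a toy stand-in for the real index, and the rule simply treats the last word of the query as X.

```python
from typing import Iterable, List, Optional


def candidate_titles(query: str) -> List[str]:
    """Return the query itself plus an 'X of Y' rewrite of a 'Y X' query."""
    words = query.split()
    candidates = [query]
    # Assumption: the last word is the head noun X, everything before it is Y.
    if len(words) >= 2:
        head, rest = words[-1], " ".join(words[:-1])
        candidates.append(f"{head} of {rest}")
    return candidates


def find_title(query: str, titles: Iterable[str]) -> Optional[str]:
    """Case-insensitive exact match of any candidate against known titles."""
    by_lower = {title.lower(): title for title in titles}
    for candidate in candidate_titles(query):
        hit = by_lower.get(candidate.lower())
        if hit is not None:
            return hit
    return None


# Toy title set (hypothetical stand-in for the article index).
TITLES = {
    "Wells Fargo",
    "Wells Fargo History Museum",
    "History of Wells Fargo",
    "List of French monarchs",
}

print(find_title("Wells Fargo history", TITLES))
print(find_title("french monarchs list", TITLES))
```

Because the sketch only adds an exact-title candidate and never reorders other results, it behaves much like an automatic redirect; a real ranking change would need testing against many queries to check that it does not make other searches worse.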
We recently had a thread in the Wikipedia Weekly Facebook group, where we pretty much concluded that the reason we don’t have a word in English for “looked it up in Wikipedia” is that the word is “Googled it.” :)
https://www.facebook.com/groups/wikipediaweekly/permalink/1050447111669786/
-Andrew
On Thu, Jul 28, 2016 at 8:09 AM, Jimmy Wales jimmywales@wikia-inc.com wrote:
[...]
Wikimedia-l mailing list, guidelines at: https://meta.wikimedia.org/wiki/Mailing_lists/Guidelines New messages to: Wikimedia-l@lists.wikimedia.org Unsubscribe: https://lists.wikimedia.org/mailman/listinfo/wikimedia-l, mailto:wikimedia-l-request@lists.wikimedia.org?subject=unsubscribe
One risk of using Google to search Wikipedia is getting bad results. For several weeks, a Google search for "gender" returned a disruptive edit[1] that replaced the entire article with "There are only 2 genders. Male and Female." That edit, from May of this year, was only live for a few minutes, but somehow got cached in Google, so this (mis)information was prominently displayed near the top of the search results. Only recently has a search on that term begun returning the updated page (which is now semi-protected through June 2017 due to excessive vandalism).
- Pax
[1] https://en.wikipedia.org/w/index.php?title=Gender&oldid=722247975
On 28 July 2016 at 13:09, Jimmy Wales jimmywales@wikia-inc.com wrote:
A journalist friend said to me that he "finally found something that Wikipedia doesn't have" and he was surprised. What was that, I said? "The history of Wells Fargo".
Go to http://www.wikipedia.org/
Make sure the dropdown in the search box is set to 'EN' - which it would have been for him.
[...]
the search results page. This is the one with a huge box of options at the top (which will be confusing and frightening to people who aren't already wikipedians)
I think that depends on the skin used. When logged out (so with the default skin) I get a toolbar with the options to which you refer hidden behind its "advanced search" option.
and then by my count the desired article is 13th on the page: [[History of Wells Fargo]].
Now, I strongly suspect this could be fixed by making a redirect from [[Wells Fargo history]] to [[History of Wells Fargo]].
I've made the former a disambiguation page (not a redirect) linking to:
* The [[History of Wells Fargo]]
* The [[Wells Fargo History Museum]]
You could have done that, too!
On 7/28/16 9:04 AM, Andy Mabbett wrote:
I think that depends on the skin used. When logged out (so with the default skin) I get a toolbar with the options to which you refer hidden behind its "advanced search" option.
Ah, that makes sense and is good.
I've made the former a disambiguation page (not a redirect) linking to:
- The [[History of Wells Fargo]]
- The [[Wells Fargo History Museum]]
You could have done that, too!
:-) Sure, but the point is that there will always be cases like this unless we invest in improving search more generally.
On 28 Jul 2016, at 17:17, Jimmy Wales jimmywales@ymail.com wrote:
:-) Sure, but the point is that there will always be cases like this unless we invest in improving search more generally.
I'd have approached this search question by looking for the 'Wells Fargo' article and looking for the history section in that article - and finding a specific article on the history of the company would be a bonus.
So partly, this is reflecting the fact that if you just search for 'Wells Fargo' on most search engines, you will be taken to a homepage for the organisation that doesn't cover the history, so you need to add 'history' on the end of the search query in order to be given a link to a history page on the corporate website. Whereas Wikipedia provides at least a summary of the history in the article about the organisation, making it easier to find - you end up on the page that contains the info you're after without having to add the extra key word.
Our search engine can definitely be significantly improved - in this case, either to find the specific article, or to point towards the history section of the main article. But search can endlessly be improved (as demonstrated by how much work Google's put into its search engine over the last decade). Plus we also have to work against some of the habits that have been ingrained in users by the way that other parts of the web work, where things can be a bit easier to find on Wikipedia than users might expect.
Thanks, Mike
We recently had a huge amount of discussion about the importance of search, on this list and elsewhere. My strong takeaway from that was, nobody disagrees with the position you're advocating here, Jimmy - that our search is problematic, and is worth investing in.
The only directly related ideas that *are* controversial, as I understand it, are that (1) an investment approaching $100 million might not be an easy sell to some (myself included), and (2) that such an investment should be publicly vetted prior to taking strong steps in that direction.
Did you hear something different?
Pete [[User:Peteforsyth]]
P.S. in the specific example, your friend would have had no trouble if he were doing serious research. The "History of WF" article is linked prominently at the top of the relevant section of the "WF" article. That's not to say we should consider the low search ranking acceptable, but it might inform how this specific example speaks to urgency.
On Jul 28, 2016 5:09 AM, "Jimmy Wales" jimmywales@wikia-inc.com wrote:
[...]
On 7/28/16 11:53 AM, Pete Forsyth wrote:
We recently had a huge amount of discussion about the importance of search, on this list and elsewhere. My strong takeaway from that was, nobody disagrees with the position you're advocating here, Jimmy - that our search is problematic, and is worth investing in.
The only directly related ideas that *are* controversial, as I understand it, are that (1) an investment approaching $100 million might not be an easy sell to some (myself included), and (2) that such an investment should be publicly vetted prior to taking strong steps in that direction.
Did you hear something different?
No, and that's exactly my view.
Hey Jimmy,
Thanks for the report. This problem is one that we've been aware of in Discovery for quite some time. It actually serves as a good example of a typical problem that we face in improving search: we know there's an issue with a small subset of searches, and could fix this problem easily with a hack, but that hack would make as many searches worse as it makes better. Meanwhile, better solutions take much more time.
But, I have good news! This quarter one of Discovery's goals [1] is to work on a proper solution to this very problem. We previously studied the problem in detail [2]. Now, following on from our upgrade of Elasticsearch last quarter, we're hoping that switching us over to BM25 [3] will fix many of these relevance issues, and we're investigating that more right now. Stay tuned!
Thanks, Dan
[1]: https://www.mediawiki.org/wiki/Wikimedia_Engineering/2016-17_Q1_Goals#Discov...
[2]: https://phabricator.wikimedia.org/T125083
[3]: https://en.wikipedia.org/wiki/Okapi_BM25
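For readers unfamiliar with BM25, the scoring function mentioned above can be sketched as follows. This is a minimal illustration of the textbook Okapi BM25 formula, not Discovery's or Elasticsearch's implementation: the function name `bm25_scores` is hypothetical, tokenization is plain whitespace splitting, the "documents" are toy article titles, and `k1=1.2`, `b=0.75` are commonly used defaults assumed here. The IDF uses the non-negative variant ln((N - n + 0.5) / (n + 0.5) + 1).

```python
import math
from collections import Counter
from typing import List


def bm25_scores(query: str, docs: List[str], k1: float = 1.2, b: float = 0.75) -> List[float]:
    """Score each document against the query with Okapi BM25."""
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Average document length; fall back to 1 to avoid dividing by zero.
    avg_len = (sum(len(tokens) for tokens in tokenized) / n_docs) or 1.0

    # Document frequency: in how many documents does each term occur?
    doc_freq = Counter()
    for tokens in tokenized:
        doc_freq.update(set(tokens))

    scores = []
    for tokens in tokenized:
        term_freq = Counter(tokens)
        score = 0.0
        for term in query.lower().split():
            tf = term_freq[term]
            if tf == 0:
                continue
            n = doc_freq[term]
            idf = math.log((n_docs - n + 0.5) / (n + 0.5) + 1)
            # Length normalization: longer-than-average documents are penalized.
            norm = tf + k1 * (1 - b + b * len(tokens) / avg_len)
            score += idf * tf * (k1 + 1) / norm
        scores.append(score)
    return scores


# Toy "documents": article titles only (an assumption for illustration).
DOCS = [
    "Wells Fargo",
    "Wells Fargo History Museum",
    "History of Wells Fargo",
    "Wells Fargo Center Philadelphia",
]
SCORES = bm25_scores("wells fargo history", DOCS)
for title, score in sorted(zip(DOCS, SCORES), key=lambda pair: -pair[1]):
    print(f"{score:.3f}  {title}")
```

On this toy data the two titles containing "history" outrank the others, because "history" is rarer than "wells" and "fargo" and so carries more weight. They tie with each other, though, since they have the same length and the same matching terms, so a formula like this applied to titles alone would not separate the museum from the history article.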
On 28 Jul 2016 5:09 a.m., "Jimmy Wales" jimmywales@wikia-inc.com wrote:
[...]
Yay!
On 7/28/16 12:01 PM, Dan Garry wrote:
[...]