Hi all,
Are there any solid estimates out there of how many Google [or other] searches have a Wikipedia article as the first [or second or third...] hit? Any language breakdowns of this would be super cool as well.
I've seen offhand references to this phenomenon in many papers, but I'm wondering if someone on this list knows of a particularly good estimate or reliable information.
thanks, Phoebe
Alexa gives some clickstream info, though without regard to search result order (e.g. 23.54% from Google).
It also shows where visitors go on Wikipedia.org, which helps with the language breakdown (58% En, 7.5% De).
http://www.alexa.com/siteinfo/wikipedia.org
Hope this is helpful, although it's not exactly what you are looking for.
Regards, Haitham Shammaa, Contribution Research Manager, Wikimedia Foundation
Imagine a world in which every single human being can freely share in the sum of all knowledge. Click the "edit" button now, and help us make it a reality!
In reply to phoebe ayers (phoebe.wiki@gmail.com), Tue, Nov 13, 2012 at 12:47 PM.
Wiki-research-l mailing list Wiki-research-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wiki-research-l
Thanks for this pointer, Haitham.
But this reminds me of a longtime concern about Alexa: Where does Alexa get its numbers? As far as I can tell (and I haven't spent more than 2 minutes looking for an answer), it's using a spyware toolbar to do client-side analytics from participating (or just ignorant... ;)) users' browsers, as well as a server-side component collecting data from visitors to those sites that have installed their component. Since neither we nor Google have installed their server-side component, am I correct to infer all they can know about wiki?edia.org domains stems from that group of users who have the Alexa toolbar installed?
If I am, isn't that worse than useless?
A.
In reply to Haitham Shammaa (hshammaa@wikimedia.org), Tue, Nov 13, 2012 at 2:38 PM.
Hi Asaf, Back to your original question: isn't "Google rank" a user-dependent parameter? I think that, depending on click history and other personalized preferences, the way Google presents results changes from user to user. So the notion of being the "first or second or ..." Google hit sounds to me like something symbolic, neither measurable nor generalizable.
In reply to Asaf Bartov (abartov@wikimedia.org), Wed, Nov 14, 2012 at 12:22 AM.
On Tue, Nov 13, 2012 at 3:41 PM, Taha Yasseri taha.yaseri@gmail.com wrote:
Hi Asaf, Back to your original question: isn't "Google rank" a user-dependent parameter? I think that, depending on click history and other personalized preferences, the way Google presents results changes from user to user. So the notion of being the "first or second or ..." Google hit sounds to me like something symbolic, neither measurable nor generalizable.
Is it? That hasn't always been true, but I guess it is likely true today. So I guess generalizing from traffic-from-Google does make more sense... I wonder how much it varies for factual queries though, for instance if you are looking up something scientific or that isn't likely to be for sale. (we should run an informal test to see if people on this list get different results!)
Haitham: thanks for the info, though isn't 23% way way too low? I thought our rule of thumb was that 70-80% of the traffic comes from google searches. I'm not sure of the exact number offhand, but I would be very surprised if it was only 23%.
Anyway, I think my original question is a useful one for lots of fields/applications, though my immediate interest is in the library literature, which has a habit of differentiating "people who start their search with Google" from "people who start their search with Wikipedia". I'd like to argue that this is essentially the same thing in many (most) cases, since so many Google searches put Wikipedia results front and center, and we know from lots of other work (SEO and HCI alike) that people rarely go beyond the first few results.
cheers, phoebe
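The informal test suggested above could be scored along these lines. This is only a sketch: the two result lists are made-up placeholders rather than real search output, and the helper names `wikipedia_rank` and `jaccard` are hypothetical.

```python
from urllib.parse import urlparse


def wikipedia_rank(result_urls):
    """Return the 1-based position of the first wikipedia.org result, or None."""
    for position, url in enumerate(result_urls, start=1):
        host = urlparse(url).hostname or ""
        if host == "wikipedia.org" or host.endswith(".wikipedia.org"):
            return position
    return None


def jaccard(results_a, results_b):
    """Share of URLs common to two result lists (1.0 means identical sets)."""
    set_a, set_b = set(results_a), set(results_b)
    union = set_a | set_b
    return len(set_a & set_b) / len(union) if union else 1.0


# Placeholder top-3 lists that two list members might report for one query.
member_1 = [
    "https://en.wikipedia.org/wiki/Alzheimer%27s_disease",
    "https://example.org/a",
    "https://example.com/b",
]
member_2 = [
    "https://example.org/a",
    "https://en.wikipedia.org/wiki/Alzheimer%27s_disease",
    "https://example.net/c",
]

rank_1 = wikipedia_rank(member_1)
rank_2 = wikipedia_rank(member_2)
overlap = jaccard(member_1, member_2)
```

Comparing the Wikipedia rank and the overlap across many members and queries would give a rough sense of how much personalization moves Wikipedia around for factual queries.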
Hello everyone, I have conducted a comparative study of Search Engine Result Pages (SERPs) across 9 Chinese geo-linguistic variants, with the aim of comparing the ranking of Baidu Baike with that of Chinese Wikipedia. From my limited research and my own personal experience, Google (or Baidu, for that matter) does show more or less the same ranking within a geo-linguistic region/variant. If you happen to be logged in to your Google account, however, your SERPs will deviate from the geo-linguistic norm. Thus, I argue that differences between SERPs are determined first by the geo-linguistic parameters and then by other "personalization" parameters. Google results for the same geo-linguistic region/variant do differ slightly between IP addresses, but the difference, as I have observed in my data, is negligible if the goal is to compare general patterns.
As for the ranking, I have constructed a measure of visibility based on the click-through rates (CTR) associated with SERP rankings, which allows researchers to convert a ranking into a measure that is much closer to traffic. User-generated encyclopedias prove to be the most visible sites for my sample of keywords, especially for the search keywords gathered from the index of the Cambridge Encyclopedia of China. For other topics, user-generated encyclopedias are still prominent, though not as much. The top three websites are all user-generated encyclopedias: http://users.ox.ac.uk/~kebl3178/uge_top3_zh.png
Baidu Baike is, of course, much more visible on Baidu; Chinese Wikipedia is more visible in the other geo-linguistic regions. You can have a look at the visualization of the visibility scores here: http://users.ox.ac.uk/~kebl3178/
Finally, the rule of thumb is that if the search query is somehow related to the name or an alternative name of an entry, that entry is very likely to show up in the SERPs. I have long hoped that other researchers would collaborate on a comparative cross-geo-linguistic project on this, to see whether Wikipedia is visible across the globe, including on non-Google search engines such as Baidu, Yandex, or Naver.
Best, han-teng liao
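A rank-to-visibility conversion of the kind described above could be sketched as follows. This is only an illustration: the CTR values, the function name `visibility_score`, and the sample queries are assumptions, not the figures or the exact method used in the study.

```python
# Hypothetical click-through rates by SERP position. These are illustrative
# values only; they are NOT taken from the study described above.
ASSUMED_CTR = {1: 0.30, 2: 0.15, 3: 0.10, 4: 0.07, 5: 0.05,
               6: 0.04, 7: 0.03, 8: 0.03, 9: 0.02, 10: 0.02}


def visibility_score(ranks_by_query, ctr=ASSUMED_CTR):
    """Average assumed CTR of one site over a set of queries.

    ranks_by_query maps each query to the site's SERP rank, or to None
    when the site does not appear on the first page (which contributes 0).
    """
    if not ranks_by_query:
        return 0.0
    total = sum(ctr.get(rank, 0.0) for rank in ranks_by_query.values())
    return total / len(ranks_by_query)


# Made-up ranks for one site on three queries.
sample_ranks = {"query a": 1, "query b": 3, "query c": None}
score = visibility_score(sample_ranks)
```

Averaging per-query CTR is one simple design choice; weighting queries by search volume would bring the score closer to a traffic estimate, if such volumes were available.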
In reply to phoebe ayers (phoebe.wiki@gmail.com), Thu, Nov 15, 2012 at 1:11 AM.
Hi Phoebe (and others on the list),
On 13-11-2012 21:47, phoebe ayers wrote:
Are there any solid estimates out there of how many Google [or other] searches have a Wikipedia article as the first [or second or third...] hit? Any language breakdowns of this would be super cool as well.
If you look in my "Wikipedia research and tools: Review and comments." http://www2.imm.dtu.dk/pubdb/views/edoc_download.php/6012/pdf/imm6012.pdf on page 15 "Popularity" you see a couple of studies using a sample of pages:
"Seeking health information online: does Wikipedia matter?" http://jamia.bmj.com/content/16/4/471.long
http://www.conductor.com/blog/2012/03/wikipedia-in-the-serps-appears-on-page-1-for-60-of-informational-34-transactional-queries/
http://www.intelligentpositioning.com/blog/2012/02/wikipedia-page-one-of-google-uk-for-99-of-searches/
The first one reports that Wikipedia was at the top of the Google result list for around 35% of health-related queries. http://jamia.bmj.com/content/16/4/471/T1.expansion.html
I've seen offhand references to this phenomenon in many papers, but I'm wondering if someone on this list knows of a particularly good estimate or reliable information.
Google has become 'bubbled'. You could try DuckDuckGo instead, e.g.,
http://duckduckgo.com/?q=Alzheimer+region%3Anone
See also: http://dontbubble.us/
/Finn
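For anyone who wants to repeat the DuckDuckGo query above for other terms, the URL can be assembled as in this small sketch. The `region:none` suffix is copied from the example URL as-is; the function name `duckduckgo_url` is hypothetical, and nothing here actually contacts the site.

```python
from urllib.parse import urlencode


def duckduckgo_url(term):
    """Build a search URL in the same form as the example above."""
    # "region:none" is taken verbatim from the example URL in this thread.
    query = f"{term} region:none"
    return "http://duckduckgo.com/?" + urlencode({"q": query})


url = duckduckgo_url("Alzheimer")
```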
Not much left to add after Finn's list, but these may be interesting as well:
https://meta.wikimedia.org/wiki/Research:Newsletter/2011/October#High_search... (In "1000 queries, Yahoo showed the most Wikipedia results within the top 10 lists (446), followed by MSN/Live (387), Google (328), and Ask.com (255)".)
https://en.wikipedia.org/wiki/Wikipedia:Wikipedia_Signpost/2011-03-07/In_the... (caused Wikipedia to rise from 7578 to 8050 (+6.2%) presences in the first search result page, in a sample of around 60,000 keywords.)
https://en.wikipedia.org/wiki/Wikipedia:Wikipedia_Signpost/2006-11-06/Search... ("Wikipedia appeared in the top 10, thus putting it on the first page of results, on 81% of searches using Google and 77% for Yahoo.")
http://stats.wikimedia.org/wikimedia/squids/SquidReportGoogle.htm ("Google referred to our sites, through its services including search, maps, and Google Earth, 212,902,650 page views per day, representing 41.1% of our external page requests. ")
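The figures quoted above can be put on a comparable footing with a few lines of arithmetic. This is a rough sketch: it assumes the per-engine counts are Wikipedia results found across the top-10 lists of 1000 queries, and it treats "around 60,000 keywords" as exactly 60,000, so the derived shares are approximate.

```python
# Wikipedia results within the top-10 lists over 1000 queries, as quoted above.
top10_results = {"Yahoo": 446, "MSN/Live": 387, "Google": 328, "Ask.com": 255}
QUERIES = 1000

# Average number of Wikipedia top-10 results per query, per engine.
results_per_query = {engine: count / QUERIES
                     for engine, count in top10_results.items()}


def relative_change(before, after):
    """Fractional change from before to after."""
    return (after - before) / before


# First-page presences before and after, as quoted above.
rise = relative_change(7578, 8050)
rise_percent = round(rise * 100, 1)

# Approximate share of keywords with a first-page presence, assuming
# exactly 60,000 keywords (the source only says "around 60,000").
ASSUMED_KEYWORDS = 60_000
share_before = 7578 / ASSUMED_KEYWORDS
share_after = 8050 / ASSUMED_KEYWORDS
```

Under these assumptions the rise works out to the quoted +6.2%, and first-page presence goes from roughly 12.6% to roughly 13.4% of the sampled keywords.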
In reply to Finn Årup Nielsen (fn@imm.dtu.dk), Thu, Nov 15, 2012 at 2:59 AM.