Hi everyone,
I am trying to better understand the pageview data and have a quick question. I apologize if it has been asked before or if it is naive.
If the web browser prefetches a Wikipedia page, does it count as one pageview in the pageview data? By "prefetching", I mean that X's Wikipedia page shows up in the search results and the browser prefetches/preloads the search results, but I do not click on X's Wikipedia page. If so, the pageview data would seem to over-count the number of visits to X's Wikipedia page.
Thanks in advance for any insight.
Chenqi Zhu New York University 44 W 4th St., Suite 10-185(B), New York, NY 10012, U.S.A.
Hi Chenqi,
You can find out more about what constitutes a pageview in its Metawiki article:
https://meta.wikimedia.org/wiki/Research:Page_view#Definition
As you can see, one of the conditions is whether the request being examined is a preview; if it is, it is not counted as a pageview. Hope this helps!
Fran
On Wed, Dec 19, 2018 at 5:30 PM Chenqi Zhu czhu@stern.nyu.edu wrote: [...]
_______________________________________________ Analytics mailing list Analytics@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/analytics
Fran, the preview to which you refer, I think that's for the Page Previews feature (i.e., when a user hovers over a link on desktop Wikipedia) or its corresponding feature in the Wikipedia for Android (triggered by default on link tap) and Wikipedia for iOS (force press) apps, is that right? Are you saying that browser-based prefetch activity (e.g., with resource hinting like https://www.w3.org/TR/resource-hints/ ) is also tagged the same way?
Chenqi Zhu, I think what you're suggesting is the possibility that browsers might be issuing HTTP prefetches for Wikimedia-hosted pages and that could inflate pageviews. I'm not sure, but have you happened to observe user agents making prefetches when resource hinting ( https://www.w3.org/TR/resource-hints/ ) is absent? I'm not sure how often, if at all, discovery platforms like search engines are actually placing resource hints into markup (which is mostly deterministic as far as browser behavior) for Wikimedia content...nor to what degree there might be heuristics being used for prefetching independently of any resource hints. Do you have any data or field observations to help clarify?
Browser settings allude to this sort of behavior (e.g., in Chrome, there's "Use a prediction service to load pages more quickly", which is described at https://support.google.com/chrome/answer/114836 ), although I think without digging into browser source code it's a bit hard to know for certain what's going on "under the hood". We do use preconnect and prefetch semantics and the like in different contexts (cf. https://phabricator.wikimedia.org/search/query/G2tr8i0YZii9 , https://phabricator.wikimedia.org/search/query/.dtx_hqaj3wj , ...there may be more).
-Adam
On Wed, Dec 19, 2018 at 11:00 AM Francisco Dans fdans@wikimedia.org wrote: [...]
> I think that's for the Page Previews feature (i.e., when a user hovers over a link on desktop Wikipedia) or its corresponding feature in the Wikipedia for Android (triggered by default on link tap)
The code that Fran pointed to only discounts "previews" made by the Android app, as we established that convention a while back. Page previews (hovering over Wikipedia links to display a short popup) are not counted as pageviews at all at this time.
> By "prefetching", I meant X's Wikipedia page shows up in the search results and the browser prefetches/preloads the search results but I do not click on X's Wikipedia page. If so, the pageview data seem to over-count the number of visits to X's Wikipedia page.
This functionality needs to be implemented by the client (it is not automagically implemented by the browser), and it is not implemented on Wikipedia. Searches trigger requests to the API, which return pageview URLs, not pageview prefetches. You can follow these workflows in the dev tools of whichever browser you are using. chrome://net-export/ is a new addition to the toolset that gives you a readable dump.
> Are you saying that browser-based prefetch activity (e.g., with resource hinting like https://www.w3.org/TR/resource-hints/ ) is also tagged the same way?
No. Browser prefetches cannot be tagged; they are initiated by the browser. Wikipedia's pages do not do any prefetching and/or pre-rendering of content using those directives, as far as I can see. dns-prefetches are done for two domains, login and meta, neither of which counts as a pageview, because a dns-prefetch does not receive an HTTP response. It only resolves the domain name ahead of time (a preconnect goes one step further and opens the connection, including TLS if any). Prefetches might be indicated by a standard set of headers like "Link:" in the future (this would be initiated by the browser), but that still seems to be in the works.
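
To make the distinction between the hint types concrete, here is a minimal Python sketch that scans a page's markup for resource hints and separates the ones that can cause an HTTP request (prefetch, prerender) from the ones that only warm up DNS or the connection (dns-prefetch, preconnect). The sample markup and the names `ResourceHintParser` and `classify_hints` are made up for illustration; this is not how the pageview pipeline works, just a way to check a page by hand.

```python
from html.parser import HTMLParser

# Hint types that can make the browser issue an HTTP request for the target.
FETCHING_HINTS = {"prefetch", "prerender"}
# Hint types that only resolve DNS or open a connection; no HTTP request is sent.
CONNECTION_ONLY_HINTS = {"dns-prefetch", "preconnect"}


class ResourceHintParser(HTMLParser):
    """Collects (rel, href) pairs for <link> elements that carry resource hints."""

    def __init__(self):
        super().__init__()
        self.hints = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attr_map = dict(attrs)
        href = attr_map.get("href")
        # rel may hold several space-separated tokens; a bare attribute yields None.
        for rel in (attr_map.get("rel") or "").lower().split():
            if rel in FETCHING_HINTS or rel in CONNECTION_ONLY_HINTS:
                self.hints.append((rel, href))


def classify_hints(html):
    """Return the hints found in `html`, grouped by whether they may fetch content."""
    parser = ResourceHintParser()
    parser.feed(html)
    return {
        "may_fetch": [h for h in parser.hints if h[0] in FETCHING_HINTS],
        "connection_only": [h for h in parser.hints if h[0] in CONNECTION_ONLY_HINTS],
    }


# Hypothetical markup for illustration; not copied from a real Wikipedia page.
SAMPLE_HTML = """
<html><head>
<link rel="dns-prefetch" href="//login.wikimedia.org">
<link rel="dns-prefetch" href="//meta.wikimedia.org">
<link rel="stylesheet" href="/w/load.php">
</head><body></body></html>
"""

result = classify_hints(SAMPLE_HTML)
print(result)
```

Only entries under `may_fetch` could plausibly show up as extra page requests; entries under `connection_only` never produce an HTTP response that could be counted.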
Thanks,
Nuria
On Wed, Dec 19, 2018 at 9:43 AM Adam Baso abaso@wikimedia.org wrote: [...]
Hi Nuria,
As I understand it, in Safari and Chrome with the default settings, there is a native browser feature that, when you search through the address bar (Google-powered), by default silently starts loading the URL of the top result shown below the address bar. Maybe there's a way we opted out, but I think it applies to Wikipedia as well.
I'm replying privately because I didn't understand the last part of your email, and maybe we are saying the same thing :)
-- Timo
On Wed, 19 Dec 2018 at 14:14, Nuria Ruiz nuria@wikimedia.org wrote: [...]
Ugh, this wasn't meant to go on-list (obviously). Sorry!
-- Timo
On Wed, 19 Dec 2018 at 18:55, Timo Tijhof ttijhof@wikimedia.org wrote: [...]
I just quickly tried this out live, looking for a prefetch webrequest from my Chrome browser for an enwiki, an enwiktionary and a wikidata page, but no prefetching happened.
After a quick search I found https://www.technipages.com/google-chrome-prefetch and confirmed that the prefetch setting is ON in my browser.
Maybe prefetching doesn't happen for our sites? Or maybe something else in my environment makes Chrome decide that it doesn't need to?
According to https://stackoverflow.com/a/9852667/4746236 some browsers do send a header that could be used to detect these prefetches. But apparently Chrome is no longer one of those browsers: https://bugs.chromium.org/p/chromium/issues/detail?id=86175
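
As a rough illustration of what header-based detection could look like on the server side, here is a small Python sketch. The header names and values in `PREFETCH_MARKERS` are the ones commonly cited for various browsers and are an assumption on my part, not something confirmed in this thread; `looks_like_prefetch` is a hypothetical helper name. As noted above, a browser that sends no such header would go undetected, so a negative result proves nothing.

```python
# Header/value pairs commonly cited as prefetch or prerender markers.
# Assumption: this list is illustrative and not exhaustive; browsers have
# added and removed these headers over time.
PREFETCH_MARKERS = {
    "x-moz": {"prefetch"},
    "x-purpose": {"preview", "prefetch"},
    "purpose": {"prefetch", "preview"},
    "sec-purpose": {"prefetch"},
}


def looks_like_prefetch(headers):
    """Return True if any request header marks the request as a prefetch.

    `headers` is a plain dict of header name -> value. Header names are
    matched case-insensitively.
    """
    for name, value in headers.items():
        allowed = PREFETCH_MARKERS.get(name.lower())
        if allowed is None:
            continue
        # A value may carry several tokens, e.g. "prefetch;prerender".
        tokens = {t.strip().lower() for t in value.replace(";", ",").split(",")}
        if tokens & allowed:
            return True
    return False


print(looks_like_prefetch({"X-moz": "prefetch"}))
print(looks_like_prefetch({"User-Agent": "Mozilla/5.0"}))
```

A check like this can only exclude requests that announce themselves; it does nothing for prefetches that look identical to ordinary page loads.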
On Thu, 20 Dec 2018 at 02:58, Timo Tijhof ttijhof@wikimedia.org wrote: [...]
Yes, I checked Chrome/Mozilla as well and saw no evidence of prefetching for Wikipedia pages.
Related point: for media counts, it does appear that prefetching by the Media Viewer might change the counts: https://wikitech.wikimedia.org/wiki/Analytics/Data_Lake/Traffic/Mediacounts#...
On Thu, Dec 20, 2018 at 4:19 AM Addshore addshorewiki@gmail.com wrote:
I just quickly tried this out live looking for a prefetch webrequest from my chrome browser for a enwiki, enwiktionary and wikidata page but no prefetching happened.
After a quick search I found https://www.technipages.com/google-chrome-prefetch and confirmed that the prefetch setting is ON in my browser.
Maybe prefetching doesn't happen for our sites? Or maybe something else in my environment makes Chrome decide that it doesn't need to?
According to https://stackoverflow.com/a/9852667/4746236 Some browsers do send a header that could be detected for these prefetches. But apparently Chrome is no longer one of those browsers, https://bugs.chromium.org/p/chromium/issues/detail?id=86175
On Thu, 20 Dec 2018 at 02:58, Timo Tijhof ttijhof@wikimedia.org wrote:
Ugh, this wasn't meant to go on-list (obviously). Sorry!
-- Timo
On Wed, 19 Dec 2018 at 18:55, Timo Tijhof ttijhof@wikimedia.org wrote:
Hi Nuria,
As I understand it, in Safari and Chrome with the default settings, there is a native browser feature that, when searching through the address bar (Google powered) by default silently starts loading the url of the top result shown below the address bar. Maybe there's a way we opted out, but I think it applies to Wikipedia as well.
I'm replying privately because I didn't understand the last part of your email, and maybe we are saying the same thing :)
-- Timo
On Wed, 19 Dec 2018 at 14:14, Nuria Ruiz nuria@wikimedia.org wrote:
I think that's for the Page Previews feature (i.e., when a user
hovers over a link on desktop Wikipedia) or
its corresponding feature in the the Wikipedia for Android (triggered
by default on link tap) The code that Fran pointed to only discounts "previews" by Android app as we stablished that convention a while back. Page previews (hovers over wikipedia links that display a short popup) are not counted as pageviews at all at this time.
By "prefetching", I meant X's Wikipedia page shows up in the search
results and the browser prefetches/preloads the search results but I do not click on X's Wikipedia page. If so, the >pageview data seem to over-count the number of visits to X's Wikipedia page. This functionality needs to be implemented by the client (it is not automagically implemented by the browser) and it is not implemented on Wikipedia. Searches trigger requests to the API, that return pageview urls, not pageview prefetches. You can follow these workflows on the dev tools of the browser you might be using. chrome://net-export/ is a new addition to the toolset that gives you a readable dump.
Are you saying that browser-based prefetch activity (e.g., with
resource hinting like https://www.w3.org/TR/resource-hints/ ) is also tagged the same way? No. Browser prefetches cannot be tagged, they are initiated by the browser. Wikipedia's pages do not do any prefetches and or pre-renderings of content using those directives as far as I can see. dns-prefetches are done for two domains: login and meta, neither of which counts as a pageview cause a dns prefetch does not receive an http response. Just instantiates a connection and resolves TLS if any. Prefetches might be indicated by a standard set of headers like "Link:" in the future (this would be initiated by the browser) but that seems on the works.
Thanks,
Nuria
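One way to check which resource hints a page actually declares is to scan its markup for the relevant `<link rel="...">` values. The following is a minimal sketch using only the Python standard library; the sample markup and the `example.org` hostnames are hypothetical stand-ins, not Wikipedia's real markup, and the helper names are made up for illustration.

```python
from html.parser import HTMLParser

# rel values described by the W3C Resource Hints draft.
# dns-prefetch and preconnect never request a document;
# prefetch and prerender do fetch the target resource.
HINT_RELS = {"dns-prefetch", "preconnect", "prefetch", "prerender"}


class ResourceHintParser(HTMLParser):
    """Collects (rel, href) pairs for every resource hint in the markup."""

    def __init__(self):
        super().__init__()
        self.hints = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attr = dict(attrs)
        # rel may hold several space-separated tokens
        for rel in (attr.get("rel") or "").lower().split():
            if rel in HINT_RELS:
                self.hints.append((rel, attr.get("href")))


def find_resource_hints(html):
    """Return the resource hints declared in an HTML string."""
    parser = ResourceHintParser()
    parser.feed(html)
    return parser.hints


# Hypothetical markup: two dns-prefetch hints and an ordinary stylesheet link.
SAMPLE = """
<html><head>
<link rel="dns-prefetch" href="//login.example.org">
<link rel="dns-prefetch" href="//meta.example.org">
<link rel="stylesheet" href="/style.css">
</head><body></body></html>
"""

if __name__ == "__main__":
    for rel, href in find_resource_hints(SAMPLE):
        print(rel, href)
```

A page whose output contains only dns-prefetch or preconnect entries gives the browser no instruction to fetch another page, so those hints alone cannot produce extra pageviews.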
On Wed, Dec 19, 2018 at 9:43 AM Adam Baso abaso@wikimedia.org wrote:
Fran, the preview to which you refer, I think that's for the Page Previews feature (i.e., when a user hovers over a link on desktop Wikipedia) or its corresponding feature in the Wikipedia for Android (triggered by default on link tap) and Wikipedia for iOS (force press) apps, is that right? Are you saying that browser-based prefetch activity (e.g., with resource hinting like https://www.w3.org/TR/resource-hints/ ) is also tagged the same way?
Chenqi Zhu, I think what you're suggesting is the possibility that browsers might be issuing HTTP prefetches for Wikimedia-hosted pages and that could inflate pageviews. I'm not sure, but have you happened to observe user agents making prefetches when resource hinting ( https://www.w3.org/TR/resource-hints/ ) is absent? I'm not sure how often, if at all, discovery platforms like search engines are actually placing resource hints into markup (which is mostly deterministic as far as browser behavior) for Wikimedia content...nor to what degree there might be heuristics being used for prefetching independently of any resource hints. Do you have any data or field observations to help clarify?
Browser settings allude to this sort of behavior (e.g., in Chrome, there's "Use a prediction service to load pages more quickly", which is described at https://support.google.com/chrome/answer/114836 ), although I think without digging into browser source code it's a bit hard to know for certain what's going on "under the hood". We do use preconnect and prefetch semantics and the like in different contexts (cf. https://phabricator.wikimedia.org/search/query/G2tr8i0YZii9 , https://phabricator.wikimedia.org/search/query/.dtx_hqaj3wj , ...there may be more).
-Adam
On Wed, Dec 19, 2018 at 11:00 AM Francisco Dans fdans@wikimedia.org wrote:
Hi Chenqi,
You can find out more about what constitutes a pageview in its Metawiki article:
https://meta.wikimedia.org/wiki/Research:Page_view#Definition
As you can see, one of the conditions is whether the request being examined is a preview or not, in which case it is not counted as a page view. Hope this helps!
Fran
On Wed, Dec 19, 2018 at 5:30 PM Chenqi Zhu czhu@stern.nyu.edu wrote:
> Hi everyone,
>
> I am trying to better understand the pageview data. I have a quick question. I apologize if the question has been asked or it is so naive.
>
> If the web browser prefetches a Wikipedia page, does it count as one pageview in the pageview data? By "prefetching", I meant X's Wikipedia page shows up in the search results and the browser prefetches/preloads the search results but I do not click on X's Wikipedia page. If so, the pageview data seem to over-count the number of visits to X's Wikipedia page.
>
> Thanks in advance for any insight.
>
> Chenqi Zhu
> New York University
> 44 W 4th St., Suite 10-185(B),
> New York, NY 10012, U.S.A.
-- 
Francisco Dans
Software Engineer, Analytics Team
Wikimedia Foundation
> in Safari and Chrome with the default settings, there is a native browser feature that, when searching through the address bar (Google powered) by default silently starts loading the url of the top result shown below the address bar.
Ah, I see, you mean searches happening "outside" the application, just through the search bar. To avoid confusion, let's call that "eagerly loading" a page (to distinguish it from the link directives for prefetching). If those requests are happening, would they be counted as pageviews? Yes. Now, are they happening? That is a harder question to answer, because in the absence of a header you cannot distinguish those requests from everything else. In any case, they do not happen deterministically for everyone, because they rely on the prediction system in Chrome. See chrome://predictors/ [1]. The fact that you have to enable them in your browser tells you usage is minimal, so is this a concern for pageview numbers? I do not think so.
> According to https://stackoverflow.com/a/9852667/4746236 some browsers do send a header that could be detected for these prefetches.
Please note that the answer to that question dates from 2012, and it mixes "pre-fetches [of] content (at the behest of the referrer page’s markup)" (which are link directives) [2] with "eagerly loading" a page (done through "network action predictions") [1].
[1] https://www.igvita.com/posa/high-performance-networking-in-google-chrome/#pr...
[2] https://developer.mozilla.org/en-US/docs/Web/HTTP/Link_prefetching_FAQ
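To illustrate what header-based detection of speculative requests could look like, here is a rough sketch. The header names and values (`X-moz: prefetch`, `X-Purpose: preview`, `Purpose: prefetch`) are ones that browsers have been reported to send at various times; whether a given browser version still sends any of them is an assumption, and eager loads triggered from the address bar may carry no marker at all. The function name is hypothetical, and this is not how the Wikimedia pageview pipeline classifies requests.

```python
# Header names (lowercased) mapped to values that have been reported
# on speculative requests. Assumption: this list is illustrative and
# incomplete; browsers have changed these markers over time.
PREFETCH_MARKERS = {
    "x-moz": {"prefetch"},                  # reported for Firefox link prefetch
    "x-purpose": {"prefetch", "preview"},   # reported for older Chrome / Safari
    "purpose": {"prefetch"},                # reported for some Chrome versions
}


def looks_like_prefetch(headers):
    """Return True if any request header carries a known prefetch marker.

    `headers` is a plain dict of header name -> value. Header names are
    compared case-insensitively, and only the part of the value before
    any ';' parameter is checked.
    """
    for name, value in headers.items():
        allowed = PREFETCH_MARKERS.get(name.strip().lower())
        if allowed and value.split(";")[0].strip().lower() in allowed:
            return True
    return False


if __name__ == "__main__":
    print(looks_like_prefetch({"X-moz": "prefetch"}))
    print(looks_like_prefetch({"User-Agent": "Mozilla/5.0"}))
```

A request with none of these headers is indistinguishable from a normal page load, which is why eager loads without a marker would simply be counted as pageviews.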
On Thu, Dec 20, 2018 at 6:41 AM Isaac Johnson isaac@wikimedia.org wrote:
Yes, I checked Chrome/Mozilla as well and saw no evidence of prefetching for Wikipedia pages.
Related point: it does appear that for media counts, that prefetching by the Media Viewer might change the counts: https://wikitech.wikimedia.org/wiki/Analytics/Data_Lake/Traffic/Mediacounts#...
On Thu, Dec 20, 2018 at 4:19 AM Addshore addshorewiki@gmail.com wrote:
I just quickly tried this out live, looking for a prefetch webrequest from my Chrome browser for an enwiki, an enwiktionary and a wikidata page, but no prefetching happened.
After a quick search I found https://www.technipages.com/google-chrome-prefetch and confirmed that the prefetch setting is ON in my browser.
Maybe prefetching doesn't happen for our sites? Or maybe something else in my environment makes Chrome decide that it doesn't need to?
According to https://stackoverflow.com/a/9852667/4746236 some browsers do send a header that could be detected for these prefetches. But apparently Chrome is no longer one of those browsers: https://bugs.chromium.org/p/chromium/issues/detail?id=86175
On Thu, 20 Dec 2018 at 02:58, Timo Tijhof ttijhof@wikimedia.org wrote:
Ugh, this wasn't meant to go on-list (obviously). Sorry!
-- Timo
On Wed, 19 Dec 2018 at 18:55, Timo Tijhof ttijhof@wikimedia.org wrote:
Hi Nuria,
As I understand it, in Safari and Chrome with the default settings, there is a native browser feature that, when searching through the address bar (Google powered) by default silently starts loading the url of the top result shown below the address bar. Maybe there's a way we opted out, but I think it applies to Wikipedia as well.
I'm replying privately because I didn't understand the last part of your email, and maybe we are saying the same thing :)
-- Timo
-- 
Isaac Johnson
Research Scientist
Wikimedia Foundation