> in Safari and Chrome with the default settings, there is a native browser feature that, when searching through the address bar (Google powered) by default silently starts loading the url of the top result shown below the address bar.
Ah, I see, you mean searches happening "outside" the application, just through the search bar. To avoid confusion, let's call that "eagerly loading" a page (to distinguish it from the link directives for prefetching). If those loads are happening, would they be counted as pageviews? Yes. Now, are they happening? That is a harder question to answer, because in the absence of a header you cannot distinguish those requests from everything else. In any case, they do not happen deterministically for everyone, because they rely on the prediction system in Chrome. See chrome://predictors/ [1]. The fact that you have to enable them in your browser tells you usage is minimal, so is this a concern for pageview numbers? I do not think so.

>According to https://stackoverflow.com/a/9852667/4746236
>Some browsers do send a header that could be detected for these prefetches.
Please note that the answer to that question is from 2012, and it mixes "pre-fetches [of] content (at the behest of the referrer page’s markup)" (which are link directives) [2] with "eagerly loading" of a page (done through "network action predictions") [1].
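
To make the header point concrete, here is a minimal sketch of how a log-processing step might flag requests that announce themselves as prefetches. The header names and values below (`X-Moz: prefetch`, `X-Purpose: preview`, `Purpose: prefetch`, `Sec-Purpose: prefetch`) are assumptions based on what some browsers have been reported to send at various times, and `looks_like_prefetch` is a hypothetical name, not part of any Wikimedia pipeline. An eagerly loaded page that carries no such header would still pass through unflagged, which is exactly the difficulty described above.

```python
# Hypothetical helper for illustration only.
# Header names and values are assumptions about what some browsers send.
PREFETCH_HINTS = {
    "x-moz": ("prefetch",),
    "x-purpose": ("preview", "prefetch"),
    "purpose": ("prefetch", "preview"),
    "sec-purpose": ("prefetch",),
}


def looks_like_prefetch(headers):
    """Return True if any request header marks the request as a prefetch."""
    for name, value in headers.items():
        tokens = PREFETCH_HINTS.get(name.strip().lower())
        if tokens is None:
            continue  # not a header we know how to interpret
        lowered = value.lower()
        if any(token in lowered for token in tokens):
            return True
    return False
```

A request with no recognised header is reported as a normal request, so this can only ever give a lower bound on prefetch traffic.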

[1] https://www.igvita.com/posa/high-performance-networking-in-google-chrome/#predictor
[2] https://developer.mozilla.org/en-US/docs/Web/HTTP/Link_prefetching_FAQ


On Thu, Dec 20, 2018 at 6:41 AM Isaac Johnson <isaac@wikimedia.org> wrote:
Yes, I checked Chrome/Mozilla as well and saw no evidence of prefetching for Wikipedia pages.

Related point: for media counts, it does appear that prefetching by the Media Viewer might change the counts: https://wikitech.wikimedia.org/wiki/Analytics/Data_Lake/Traffic/Mediacounts#Corner_cases

On Thu, Dec 20, 2018 at 4:19 AM Addshore <addshorewiki@gmail.com> wrote:
I just quickly tried this out live, looking for a prefetch webrequest from my Chrome browser for an enwiki, enwiktionary and wikidata page, but no prefetching happened.

After a quick search I found https://www.technipages.com/google-chrome-prefetch and confirmed that the prefetch setting is ON in my browser.

Maybe prefetching doesn't happen for our sites? Or maybe something else in my environment makes Chrome decide that it doesn't need to?

Some browsers do send a header that could be detected for these prefetches.
But apparently Chrome is no longer one of those browsers, https://bugs.chromium.org/p/chromium/issues/detail?id=86175 


On Thu, 20 Dec 2018 at 02:58, Timo Tijhof <ttijhof@wikimedia.org> wrote:
Ugh, this wasn't meant to go on-list (obviously). Sorry!

-- Timo

On Wed, 19 Dec 2018 at 18:55, Timo Tijhof <ttijhof@wikimedia.org> wrote:
Hi Nuria,

As I understand it, in Safari and Chrome with the default settings, there is a native browser feature that, when searching through the address bar (Google powered) by default silently starts loading the url of the top result shown below the address bar. Maybe there's a way we opted out, but I think it applies to Wikipedia as well.

I'm replying privately because I didn't understand the last part of your email, and maybe we are saying the same thing :)

-- Timo



On Wed, 19 Dec 2018 at 14:14, Nuria Ruiz <nuria@wikimedia.org> wrote:
> I think that's for the Page Previews feature (i.e., when a user hovers over a link on desktop Wikipedia) or
> its corresponding feature in the Wikipedia for Android (triggered by default on link tap)
The code that Fran pointed to only discounts "previews" by the Android app, as we established that convention a while back. Page previews (hovers over Wikipedia links that display a short popup) are not counted as pageviews at all at this time.
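
As a rough illustration of that convention, the sketch below treats a request as a preview when its X-Analytics header carries a `preview=1` pair and leaves it out of the count. The header format (semicolon-separated `key=value` pairs), the `preview=1` tag, the `x_analytics` field name, and all function names are assumptions made for illustration; the real pageview definition has many more conditions.

```python
def parse_x_analytics(raw):
    """Split an assumed 'key=value;key=value' header into a dict."""
    pairs = {}
    for part in raw.split(";"):
        key, sep, value = part.strip().partition("=")
        if sep and key:
            pairs[key] = value
    return pairs


def is_preview(x_analytics):
    """True when the request is tagged as a preview (assumed tag: preview=1)."""
    return parse_x_analytics(x_analytics).get("preview") == "1"


def count_pageviews(requests):
    """Count requests that are not tagged as previews.

    Each request is a dict with an optional 'x_analytics' string
    (a hypothetical field name for this sketch).
    """
    return sum(1 for r in requests if not is_preview(r.get("x_analytics", "")))
```

The point of the design is that the client has to tag the request itself; nothing on the server side can infer a preview from an untagged request.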

>By "prefetching", I meant X's Wikipedia page shows up in the search results and the browser prefetches/preloads the search results but I do not click on X's Wikipedia page. If so, the pageview data seem to over-count the number of visits to X's Wikipedia page.
This functionality needs to be implemented by the client (it is not automagically implemented by the browser), and it is not implemented on Wikipedia. Searches trigger requests to the API that return pageview URLs, not pageview prefetches. You can follow these workflows in the dev tools of whichever browser you are using. chrome://net-export/ is a new addition to the toolset that gives you a readable dump.

>Are you saying that browser-based prefetch activity (e.g., with resource hinting like https://www.w3.org/TR/resource-hints/ ) is also tagged the same way?
No. Browser prefetches cannot be tagged; they are initiated by the browser. Wikipedia's pages do not do any prefetches and/or pre-renderings of content using those directives as far as I can see. dns-prefetches are done for two domains, login and meta, neither of which counts as a pageview, because a dns prefetch does not receive an HTTP response; it just resolves the domain name ahead of time (opening the connection and negotiating TLS early is what preconnect does). Prefetches might be indicated by a standard set of headers like "Link:" in the future (this would be initiated by the browser), but that seems to be in the works.
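
One way to check a page for such directives yourself is to list its `<link>` resource hints and separate the ones that would fetch a document (`prefetch`, `prerender`) from the ones that only warm up DNS or a connection (`dns-prefetch`, `preconnect`). The sketch below uses only the Python standard library; the class and function names are hypothetical, and the grouping of `rel` values is my reading of the Resource Hints draft, not something taken from Wikimedia code.

```python
from html.parser import HTMLParser

# rel values from the Resource Hints draft; only some cause an HTTP request.
HINTS_WITH_HTTP_REQUEST = {"prefetch", "prerender"}
HINTS_WITHOUT_HTTP_REQUEST = {"dns-prefetch", "preconnect"}
ALL_HINTS = HINTS_WITH_HTTP_REQUEST | HINTS_WITHOUT_HTTP_REQUEST


class ResourceHintCollector(HTMLParser):
    """Collect (rel, href) pairs for resource hints found in <link> tags."""

    def __init__(self):
        super().__init__()
        self.hints = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attr = dict(attrs)
        href = attr.get("href") or ""
        # rel may hold several space-separated tokens.
        for rel in (attr.get("rel") or "").lower().split():
            if rel in ALL_HINTS:
                self.hints.append((rel, href))


def collect_hints(html):
    """Return every resource hint in the markup, in document order."""
    collector = ResourceHintCollector()
    collector.feed(html)
    collector.close()
    return collector.hints


def hints_that_fetch(html):
    """Return only the hints that would make the browser request a resource."""
    return [(rel, href) for rel, href in collect_hints(html)
            if rel in HINTS_WITH_HTTP_REQUEST]
```

If `hints_that_fetch` comes back empty for a page's markup, that page is not asking the browser to fetch anything ahead of time through link directives.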

Thanks,

Nuria


On Wed, Dec 19, 2018 at 9:43 AM Adam Baso <abaso@wikimedia.org> wrote:
Fran, the preview to which you refer, I think that's for the Page Previews feature (i.e., when a user hovers over a link on desktop Wikipedia) or its corresponding feature in the Wikipedia for Android (triggered by default on link tap) and Wikipedia for iOS (force press) apps, is that right? Are you saying that browser-based prefetch activity (e.g., with resource hinting like https://www.w3.org/TR/resource-hints/ ) is also tagged the same way?

Chenqi Zhu, I think what you're suggesting is the possibility that browsers might be issuing HTTP prefetches for Wikimedia-hosted pages and that could inflate pageviews. I'm not sure, but have you happened to observe user agents making prefetches when resource hinting ( https://www.w3.org/TR/resource-hints/ ) is absent? I'm not sure how often, if at all, discovery platforms like search engines are actually placing resource hints into markup (which is mostly deterministic as far as browser behavior) for Wikimedia content...nor to what degree there might be heuristics being used for prefetching independently of any resource hints. Do you have any data or field observations to help clarify?

Browser settings allude to this sort of behavior (e.g., in Chrome, there's "Use a prediction service to load pages more quickly", which is described at https://support.google.com/chrome/answer/114836 ), although I think without digging into browser source code it's a bit hard to know for certain what's going on "under the hood". We do use preconnect and prefetch semantics and the like in different contexts (cf. https://phabricator.wikimedia.org/search/query/G2tr8i0YZii9 , https://phabricator.wikimedia.org/search/query/.dtx_hqaj3wj , ...there may be more).

-Adam



On Wed, Dec 19, 2018 at 11:00 AM Francisco Dans <fdans@wikimedia.org> wrote:
Hi Chenqi,

You can find out more about what constitutes a pageview in its Metawiki article:


As you can see, one of the conditions is whether the request being examined is a preview or not, in which case it is not counted as a page view. Hope this helps!

Fran

On Wed, Dec 19, 2018 at 5:30 PM Chenqi Zhu <czhu@stern.nyu.edu> wrote:
Hi everyone,

I am trying to better understand the pageview data. I have a quick question. I apologize if the question has been asked before or if it is naive.

If the web browser prefetches a Wikipedia page, does it count as one pageview in the pageview data? By "prefetching", I meant X's Wikipedia page shows up in the search results and the browser prefetches/preloads the search results but I do not click on X's Wikipedia page. If so, the pageview data seem to over-count the number of visits to X's Wikipedia page.

Thanks in advance for any insight. 

 
Chenqi Zhu
New York University
44 W 4th St., Suite 10-185(B),
New York, NY 10012, U.S.A.
_______________________________________________
Analytics mailing list
Analytics@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/analytics


--
Francisco Dans
Software Engineer, Analytics Team
Wikimedia Foundation


--
Isaac Johnson -- Research Scientist -- Wikimedia Foundation