Hi Nuria,

As I understand it, Safari and Chrome with default settings have a native browser feature that, when you search through the address bar (Google-powered by default), silently starts loading the URL of the top result shown below the address bar. Maybe we have opted out of it somehow, but I think it applies to Wikipedia as well.

I'm replying privately because I didn't understand the last part of your email, and maybe we are saying the same thing :)

-- Timo



On Wed, 19 Dec 2018 at 14:14, Nuria Ruiz <nuria@wikimedia.org> wrote:
> I think that's for the Page Previews feature (i.e., when a user hovers over a link on desktop Wikipedia) or
> its corresponding feature in the Wikipedia for Android (triggered by default on link tap)
The code that Fran pointed to only discounts "previews" by the Android app, as we established that convention a while back. Page previews (hovering over Wikipedia links to display a short popup) are not counted as pageviews at all at this time.

> By "prefetching", I meant X's Wikipedia page shows up in the search results and the browser prefetches/preloads the search results but I do not click on X's Wikipedia page. If so, the pageview data seem to over-count the number of visits to X's Wikipedia page.
This functionality needs to be implemented by the client (it is not automagically implemented by the browser) and it is not implemented on Wikipedia. Searches trigger requests to the API, which return pageview URLs, not pageview prefetches. You can follow these workflows in the dev tools of whichever browser you are using. chrome://net-export/ is a new addition to the toolset that gives you a readable dump.

>Are you saying that browser-based prefetch activity (e.g., with resource hinting like https://www.w3.org/TR/resource-hints/ ) is also tagged the same way?
No. Browser prefetches cannot be tagged; they are initiated by the browser. Wikipedia's pages do not do any prefetching and/or pre-rendering of content using those directives, as far as I can see. DNS prefetches are done for two domains, login and meta, and neither counts as a pageview, because a DNS prefetch does not receive an HTTP response. It only resolves the hostname ahead of time; even a preconnect, which opens the connection and negotiates TLS if any, issues no HTTP request. Prefetches might be indicated by a standard set of headers like "Link:" in the future (this would be initiated by the browser), but that still seems to be in the works.
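
To make the header idea concrete, here is a minimal sketch in Python of how a request filter could look for prefetch hints. It is an illustration under stated assumptions, not how the pageview definition works today: the function name `is_prefetch_request` and the `PREFETCH_HINTS` table are hypothetical, and the header names listed are ones that some browsers have attached to speculative loads, with no guarantee that any given browser sends them.

```python
from typing import Mapping

# Hypothetical table: request headers that some browsers have attached to
# speculative loads. This is an assumption for illustration and is not taken
# from the Wikimedia pageview definition.
PREFETCH_HINTS = {
    "purpose": ("prefetch", "preview"),
    "sec-purpose": ("prefetch",),
    "x-purpose": ("preview",),
    "x-moz": ("prefetch",),
}


def is_prefetch_request(headers: Mapping[str, str]) -> bool:
    """Return True if any known hint header marks the request as speculative."""
    # HTTP header names are case-insensitive, so normalize before comparing.
    normalized = {name.lower(): value.lower() for name, value in headers.items()}
    for name, markers in PREFETCH_HINTS.items():
        value = normalized.get(name, "")
        if any(marker in value for marker in markers):
            return True
    return False
```

A request carrying `Purpose: prefetch` would be flagged, while one with only a `User-Agent` header would not. A prefetch sent without any such header would pass through undetected, which is the limitation described above.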

Thanks,

Nuria


On Wed, Dec 19, 2018 at 9:43 AM Adam Baso <abaso@wikimedia.org> wrote:
Fran, the preview to which you refer, I think that's for the Page Previews feature (i.e., when a user hovers over a link on desktop Wikipedia) or its corresponding feature in the Wikipedia for Android (triggered by default on link tap) and Wikipedia for iOS (force press) apps, is that right? Are you saying that browser-based prefetch activity (e.g., with resource hinting like https://www.w3.org/TR/resource-hints/ ) is also tagged the same way?

Chenqi Zhu, I think what you're suggesting is the possibility that browsers might be issuing HTTP prefetches for Wikimedia-hosted pages and that this could inflate pageviews. I'm not sure, but have you happened to observe user agents making prefetches when resource hinting ( https://www.w3.org/TR/resource-hints/ ) is absent? I'm not sure how often, if at all, discovery platforms like search engines are actually placing resource hints into markup (which is mostly deterministic as far as browser behavior goes) for Wikimedia content, nor to what degree heuristics might be used for prefetching independently of any resource hints. Do you have any data or field observations to help clarify?

Browser settings allude to this sort of behavior (e.g., in Chrome, there's "Use a prediction service to load pages more quickly", which is described at https://support.google.com/chrome/answer/114836 ), although I think without digging into browser source code it's a bit hard to know for certain what's going on "under the hood". We do use preconnect and prefetch semantics and the like in different contexts (cf. https://phabricator.wikimedia.org/search/query/G2tr8i0YZii9 , https://phabricator.wikimedia.org/search/query/.dtx_hqaj3wj , ...there may be more).

-Adam



On Wed, Dec 19, 2018 at 11:00 AM Francisco Dans <fdans@wikimedia.org> wrote:
Hi Chenqi,

You can find out more about what constitutes a pageview in its Metawiki article:


As you can see, one of the conditions is whether the request being examined is a preview; if it is, it is not counted as a pageview. Hope this helps!

Fran

On Wed, Dec 19, 2018 at 5:30 PM Chenqi Zhu <czhu@stern.nyu.edu> wrote:
Hi everyone,

I am trying to better understand the pageview data, and I have a quick question. I apologize if it has already been asked or is naive.

If the web browser prefetches a Wikipedia page, does it count as one pageview in the pageview data? By "prefetching", I meant X's Wikipedia page shows up in the search results and the browser prefetches/preloads the search results but I do not click on X's Wikipedia page. If so, the pageview data seem to over-count the number of visits to X's Wikipedia page.

Thanks in advance for any insight. 

 
Chenqi Zhu
New York University
44 W 4th St., Suite 10-185(B),
New York, NY 10012, U.S.A.
_______________________________________________
Analytics mailing list
Analytics@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/analytics


--
Francisco Dans
Software Engineer, Analytics Team
Wikimedia Foundation