> Do you think it's worth getting the UA distribution for CSS requests & correlate it with the distribution for page / JS loading?
Yes, we can do that. I would need to gather a new dataset, so I've created a new task for it (https://phabricator.wikimedia.org/T89847) and am marking this one as complete: https://phabricator.wikimedia.org/T88560
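
A minimal sketch of what that comparison could look like, assuming the new dataset can be reduced to two lists of user-agent labels, one per page request and one per CSS request. The function names, the labels and the sample data are all made up for illustration; this is not the actual query.

```python
from collections import Counter

def ua_shares(user_agents):
    """Return each user agent's share (0..1) of the given requests."""
    counts = Counter(user_agents)
    total = sum(counts.values())
    return {ua: n / total for ua, n in counts.items()}

def css_to_page_ratio(page_uas, css_uas):
    """Ratio of CSS-request share to page-request share, per user agent.

    A ratio well below 1 means the UA requests pages but rarely requests
    CSS, which hints at a non-browser client behind that UA string.
    """
    page = ua_shares(page_uas)
    css = ua_shares(css_uas)
    return {ua: css.get(ua, 0.0) / share for ua, share in page.items()}

# Hypothetical sample data, for illustration only.
page_uas = ["IE6", "IE6", "IE6", "Chrome", "Chrome", "Chrome", "Chrome", "OperaMini"]
css_uas = ["IE6", "Chrome", "Chrome", "Chrome", "Chrome", "OperaMini"]

ratios = css_to_page_ratio(page_uas, css_uas)
for ua, ratio in sorted(ratios.items(), key=lambda item: item[1]):
    print(f"{ua}: {ratio:.2f}")
```

In this made-up sample the "IE6" label has a much smaller share of CSS requests than of page requests, which is the kind of gap that would suggest a bot masquerading as an old browser.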


I would also like to do some research on IE6/IE7: according to our code (https://github.com/wikimedia/mediawiki/blob/master/resources/src/startup.js) we should see those in the no-JS list, but we only see some of their user agents there. There are definitely IE6/IE7 browsers to which we are serving JavaScript; I just have to look in detail at what exactly we are serving them. I will report on this. It looks like the startup.js file is served to all browsers regardless, so I might need to run some more fine-grained queries.

For now, consider the 3% an approximate upper bound for overall traffic with big bots removed. If you count only mobile traffic, the percentages are, of course, a lot higher.

Thanks, 

Nuria



On Wed, Feb 18, 2015 at 8:01 AM, Gabriel Wicke <gwicke@wikimedia.org> wrote:
Nuria,

your explanation for the increase of the no-JS numbers makes a lot of sense to me. This is very valuable information, as it contradicts the assumption that JS support kept going up in the meantime. Thank you!

The main thing I'm slightly worried about with relatively small total numbers is that even one or two bots masquerading as old browser UAs could skew the results for those UAs. This matters especially once we are trying to establish trends based on this early measurement. Using additional behavioral factors like image (or, perhaps better, CSS) loading might help to more precisely weed out non-browser users, which could benefit our UA detection precision in general. Do you think it's worth getting the UA distribution for CSS requests & correlate it with the distribution for page / JS loading?

Gabriel

On Wed, Feb 18, 2015 at 7:17 AM, Nuria Ruiz <nuria@wikimedia.org> wrote:
Sorry, I forgot to address this earlier:

>Do you think there are ways to fine-tune this, perhaps by excluding clients that also didn't download images?
We can look at that, but I suspect the results will not differ much. Let me know if you think it is necessary.

On Mon, Feb 16, 2015 at 9:23 PM, Nuria Ruiz <nuria@wikimedia.org> wrote:
>That number is higher than I expected given that the general web was apparently closer to 1.3% in 2010.
Hmm, that study looks too old to be relevant. Two things on that:

1) Numbers from 2010 do not include mobile browsers that are in widespread use nowadays.
For example: "Opera Mini". More than 1% of our requests come from this browser alone (and I bet that in 2015 Yahoo is seeing quite a few of those too). Note that this 1% is a more precise figure: it is derived directly from the Hadoop logs and requires no guesswork.

So it is not surprising that the share of pageviews without JavaScript has gone up once you take mobile into account. Opera Mini does not support JavaScript in the ways you would expect: https://dev.opera.com/articles/opera-mini-and-javascript/


2) Our data differs from global stats in significant ways. 
For example, our IE6 and IE7 traffic is way higher than the global stats reported by http://gs.statcounter.com/ for the month of January. Note that these browser percentages are more precise estimates on our end (unlike the JavaScript estimate, which requires some cross-checking and guesswork). Also note that the total pageview count these percentages are computed over includes bots, so once those are excluded our IE6 and IE7 share is even higher than the figures below.

Browser: percentage of total pageviews by our count, global percentage per StatCounter
IE6: 1.01%, 0.09%
IE7: 0.7%, 0.14%

I do not expect our numbers to match StatCounter's 100%, but I think it is an OK guide for cross-checking ourselves, especially because they deploy their beacons worldwide: http://gs.statcounter.com/faq#methodology
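
To put the gap in perspective, a small worked calculation of how many times larger our share is than the global one, using only the four percentages listed above. The variable names are illustrative. Since our totals include bots, these ratios are, if anything, on the low side.

```python
# Percentages copied from the IE6/IE7 figures above.
ours = {"IE6": 1.01, "IE7": 0.7}
statcounter = {"IE6": 0.09, "IE7": 0.14}

# How many times larger our share is than the global share.
ratios = {browser: ours[browser] / statcounter[browser] for browser in ours}

for browser, ratio in ratios.items():
    print(f"{browser}: about {ratio:.1f}x the StatCounter share")
```

This works out to roughly 11x for IE6 and 5x for IE7.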


>Finally, is there a way to gauge the difference in JS support between anonymous & authenticated users from this data?
No, I do not think we can do that with this dataset.

On Mon, Feb 16, 2015 at 7:19 PM, Gabriel Wicke <gwicke@wikimedia.org> wrote:
Thank you, Nuria!

That number is higher than I expected given that the general web was apparently closer to 1.3% in 2010. Do you think there are ways to fine-tune this, perhaps by excluding clients that also didn't download images?

Finally, is there a way to gauge the difference in JS support between anonymous & authenticated users from this data?

Gabriel

On Mon, Feb 16, 2015 at 6:38 PM, Nuria Ruiz <nuria@wikimedia.org> wrote:
Gabriel:

I have run through the data and have a rough estimate of how many of our pageviews are requested from browsers w/o strong JavaScript support. It is a preliminary, rough estimate, but I think it is pretty useful.

TL;DR
According to our new pageview definition (https://meta.wikimedia.org/wiki/Research:Page_view), about 10% of pageviews come from clients w/o much JavaScript support. But - BIG CAVEAT - this includes bot requests. If you remove the easy-to-spot big bots, the percentage is <3%.

Details here (still some homework to do regarding IE6 and IE7) 


Thanks, 

Nuria