The list of articles was decided here: phabricator.wikimedia.org/T120504
There's one page for each of a few categories, such as a very short article (Campus honeymoon) and a long one (Barack Obama).
As I mentioned in the task, it is trivial to run the report with a different sample set, and I'm happy to do so if somebody is interested in visualizing a different set of articles.
For example, Adam posted in the task different sets of articles based on other criteria, like page views (https://phabricator.wikimedia.org/T120504#1900287). It'd be interesting to run those and see if the trends in navbox and reference sizes hold up.
We're also thinking about running a similar report across a bigger dataset in a more aggregated way, maybe the top 100,000 articles by pageviews, with the sizes weighted by pageview counts, to get a more global understanding. We haven't gotten around to it yet (it would be a new report, more global and less per-page).
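To make the weighting idea concrete, here is a minimal sketch of a pageview-weighted average size. It is not the actual report code: the function name, the tuple layout, and the numbers in `sample` are all made up for illustration.

```python
def weighted_mean_size(pages):
    """Return the pageview-weighted mean size.

    `pages` is an iterable of (size_bytes, pageviews) tuples.
    This layout is an assumption, not the real report's format.
    """
    pages = list(pages)  # allow generators; we iterate twice
    total_views = sum(views for _, views in pages)
    if total_views == 0:
        raise ValueError("total pageviews must be greater than zero")
    # Each page contributes its size in proportion to how often it is viewed.
    return sum(size * views for size, views in pages) / total_views


# Hypothetical numbers: a rarely viewed stub and a heavily viewed long article.
sample = [
    (2_000, 100),    # (size in bytes, pageviews)
    (200_000, 900),
]

plain_mean = sum(size for size, _ in sample) / len(sample)
weighted = weighted_mean_size(sample)
print(plain_mean, weighted)
```

With numbers like these, the weighted figure moves toward the heavily viewed page, which is the point of weighting: it reflects what readers actually load rather than what exists in the article list.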
Joaquin Oltra Hernandez, 20/01/2016 16:55:
We've added MediaWiki parser content analysis to the content analysis
report that the Reading web team performed last quarter.
Thanks. It would be useful to understand what your dataset is: I see 9 page titles, presumably fetched from the English Wikipedia. Is this your dataset? How did you ensure it's representative of what users see?
Nemo