<quote who="Valerio Schiavoni" date="Wed, Sep 24, 2014 at 12:09:44PM">
I'm sorry to contradict you, but at least on the Wikibench traces, that
information is very well present. I see things like:
I'm quite surprised that such information is not known by the
community of Wikipedia researchers.
Well, my ignorance is my own and does not reflect the community of
Wikipedia researchers. :)
But, as Scott pointed out, I was referring to pagecount data published
by WMF (i.e., the data binned by hour that we were discussing in the
I was replying to the discussion about the granularity of the
pagecount data to point out that increased granularity won't help you
because the data you want isn't provided in /that/ dataset at all.
Wikibench is the only source of data I know of that includes hits to
the "/w/index.php" pages for all of Wikipedia (I'd love to hear that
I'm wrong about that). Unfortunately, Wikibench was, as far as I know,
basically a one-off thing. It's great if you want a 10% sample of this
kind of data for a ~3.5-month period in late 2007. If you want
anything that is less stale, I think you're going to have to try to
cut a deal with WMF to collect it.
Benjamin Mako Hill
Creativity can be a social contribution, but only in so far
as society is free to use the results. --GNU Manifesto