I suppose you could get more granular data by conducting an opt-in study of some kind, but you would need to be careful that users who haven't opted in are not accidentally included and do not have their privacy indirectly affected. I agree that collecting data at intervals shorter than an hour would raise a lot of privacy concerns for users who have not opted in.
Pine
On Thu, Sep 18, 2014 at 12:03 PM, Benj. Mako Hill mako@atdot.cc wrote:
<quote who="Valerio Schiavoni" date="Wed, Sep 17, 2014 at 04:14:04PM +0200">
Unfortunately, no. Those logs only provide page counts but without the associated timestamps ("when" those pages have been accessed). If such logs exist, they would perfectly do.
The pagecount data /has/ timing data but they are "binned" by the hour.
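To make the hourly binning concrete, here is a rough Python sketch of how one might read those dumps. It assumes the published file naming (something like "pagecounts-20140917-160000.gz") and space-separated records of project, page title, view count and bytes; the function names and the sample values are made up for illustration. The only timing information is the timestamp in the file name, so every record in a file shares the same hour.

```python
import re
from datetime import datetime

# Assumed file-name pattern for the hourly dumps,
# e.g. "pagecounts-20140917-160000.gz".
FILENAME_RE = re.compile(r"pagecounts-(\d{8})-(\d{6})")


def hour_from_filename(name):
    """Return the timestamp encoded in a dump file name (hypothetical helper)."""
    match = FILENAME_RE.search(name)
    if match is None:
        raise ValueError(f"unrecognized file name: {name}")
    return datetime.strptime(match.group(1) + match.group(2), "%Y%m%d%H%M%S")


def parse_line(line):
    """Split one record into (project, page_title, view_count, bytes).

    Assumes four space-separated fields per line; there is no
    per-request timestamp anywhere in the record.
    """
    project, title, count, size = line.rstrip("\n").split(" ")
    return project, title, int(count), int(size)


# Illustrative values only, not real data.
bin_time = hour_from_filename("pagecounts-20140917-160000.gz")
record = parse_line("en Main_Page 42 1000\n")
```

Anything finer than the hour (for example, the order in which two pages were viewed within the same hour) cannot be recovered from records like these.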
I don't think more comprehensive data (all pages, all languages, nearly all viewers) over a long period of time exists anywhere and I don't think any similarly comprehensive data exists before 2007 at all.
You might find more granular data for short periods of time (like the WikiBench data or maybe stuff that's been collected more recently by WMF but isn't published) or much more detailed data from longer periods of time for a subset of users on a particular network (perhaps like the Indiana data, or toolbar data like the Yahoo data that some WP researchers have used).
I would /love/ to hear that I am wrong about this and that there's some wonderful, granular, broad, long-term dataset of pageviews I just don't know about. :)
Later, Mako
-- Benjamin Mako Hill http://mako.cc/
Creativity can be a social contribution, but only in so far as society is free to use the results. --GNU Manifesto
Wiki-research-l mailing list Wiki-research-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wiki-research-l