IR-Cache provides its traces at sub-second granularity, and has been doing
so for years. They deal with the storage problem by keeping a rotating log
that covers at most one week, so when they add a new file for today, they
delete the one for Monday of last week. Anyone who needs more than one week
of data has to write their own script or download the files at least once
a week.
Should Wikimedia provide such data, there shouldn't be a storage problem.
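
To make the rotation concrete, here is a minimal sketch in Python. It is
not IR-Cache's actual tooling: the function name `rotate`, the constant
`RETENTION_DAYS`, and the assumption of one trace file per calendar day
are all hypothetical and only illustrate the one-week window described
above.

```python
from datetime import date, timedelta

# Assumption: "maximum one week" means seven daily files are retained.
RETENTION_DAYS = 7


def rotate(stored_days, today, retention_days=RETENTION_DAYS):
    """Add today's file to the log and split the days into (kept, deleted).

    stored_days: iterable of datetime.date, one per trace file already stored.
    today: datetime.date of the file being added.
    """
    days = set(stored_days) | {today}
    # Files dated on or before the cutoff fall outside the retention window.
    cutoff = today - timedelta(days=retention_days)
    kept = sorted(d for d in days if d > cutoff)
    deleted = sorted(d for d in days if d <= cutoff)
    return kept, deleted


if __name__ == "__main__":
    today = date(2014, 9, 22)  # a Monday
    stored = [today - timedelta(days=n) for n in range(1, 8)]
    kept, deleted = rotate(stored, today)
    print("kept:", [d.isoformat() for d in kept])
    print("deleted:", [d.isoformat() for d in deleted])
```

With seven stored days ending on Sunday, adding Monday's file keeps seven
files and drops only the file for the previous Monday, so storage stays
bounded no matter how fine the granularity inside each file is.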
On Mon, Sep 22, 2014 at 7:13 AM, Pine W <wiki.pine(a)gmail.com> wrote:
Hm, on the second point the person to ask is Toby, but it sounds like
there are reasons for the minimum one-hour granularity, and with Oliver's
point it sounds like this research approach won't produce the intended
benefits anyway. Perhaps another reason for the one-hour minimum granularity
is that the storage and other resource requirements for highly granular
data are too expensive to justify the benefits.
Wiki-research-l mailing list