On Thu, Apr 22, 2010 at 6:31 PM, Platonides <Platonides@gmail.com> wrote:
S. Nunes wrote:
> Hi all,
>
> I presume that Wikipedia keeps data about HTTP accesses to all articles.
> Can anybody inform me if this data is available for research purposes?

No. With the amount of traffic it has, space needs would be immense, and
Wikimedia is not interested in logging all accesses.

What kind of space needs are we talking about? I find it hard to imagine
that the other top 10 websites aren't keeping this information. Shouldn't
you be logging every access, at least for a few days, in case of some sort
of security breach?
 
> Access to this information poses
> no risk to users' privacy since no user information is made available
> - session IDs, hour/minute timestamp data, and IPs could be easily
> discarded.


What if your referer was your personal Facebook page, leaking your full
real name?

And what if you're in the sample? I find it quite inappropriate that even
sampled data like this is being released.