Hello Giovanni,
thanks for the pointer to the Click datasets.
I'd have to take a look at the complete dataset to see how many of
those requests touch Wikipedia.
Then, one of the requirements to access the data is:
"The Click Dataset is large (~2.5 TB compressed), which requires that
it be transferred on a physical hard drive. You will have to provide the
drive as well as pre-paid return shipment."
I have to check whether this is possible, and how long it might take to ship
a hard drive from Switzerland and have it sent back.
I'll let you know!
Best,
Valerio
On Wed, Sep 17, 2014 at 4:09 PM, Giovanni Luca Ciampaglia <
gciampag(a)indiana.edu> wrote:
Valerio,
I didn't know such data existed. As an alternative, perhaps you could
have a look at our click datasets, which contain requests to the Web at
large (i.e., not just Wikipedia) generated from within the campus of
Indiana University over a period of several months. HTH
http://carl.cs.indiana.edu/data/#click
Cheers
G
Giovanni Luca Ciampaglia
✎ 919 E 10th ∙ Bloomington 47408 IN ∙ USA
☞ http://www.glciampaglia.com/
✆ +1 812 855-7261
✉ gciampag(a)indiana.edu
2014-09-17 9:53 GMT-04:00 Valerio Schiavoni <
valerio.schiavoni(a)gmail.com>:
> Hello,
> just bumping my email from last week, since I have not received an
> answer so far.
>
> Should I consider that dataset to be somehow lost?
>
> I've also contacted the researchers who partially released it, but
> making it publicly available is tricky for them due to its size (12 TB),
> a size that might instead be routine for the daily operations of the
> Wikipedia servers.
>
> Thanks again,
> Valerio
>
>>
>> On Wed, Sep 10, 2014 at 4:15 AM, Valerio Schiavoni <
>> valerio.schiavoni(a)gmail.com> wrote:
>>
>>> Dear Wikimedia Foundation,
>>> in the context of an EU research project [1], we are interested in
>>> accessing Wikipedia access traces.
>>> In the past, such traces were given to other groups for research
>>> purposes [2].
>>> Unfortunately, only a small percentage (10%) of that trace has been
>>> made available.
>>> We are interested in accessing the totality of that same trace (or,
>>> even better, a more recent one, but the same one will do).
>>>
>>> If this is not the correct mailing list for such requests, could anyone
>>> please redirect me to the correct one?
>>>
>>> Thanks again for your attention,
>>>
>>> Valerio Schiavoni
>>> Post-Doc Researcher
>>> University of Neuchatel, Switzerland
>>>
>>> 1 - http://www.leads-project.eu
>>> 2 - http://www.wikibench.eu/?page_id=60
>>>
>>
>>
>
> _______________________________________________
> Wiki-research-l mailing list
> Wiki-research-l(a)lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/wiki-research-l
>
>