Hoi, Do I understand correctly that the 3% of "other" links are the ones that have articles at *this* time but did not exist at the time of the dump, so that in effect they are not red links?
Is there any way to find the articles people were seeking but could not find? Thanks, GerardM
On 16 January 2018 at 20:21, Leila Zia <leila@wikimedia.org> wrote:
Hi all,
For archive happiness:
The clickstream dataset is now being generated on a monthly basis for 5 Wikipedia languages (English, Russian, German, Spanish, and Japanese). You can access the data at https://dumps.wikimedia.org/other/clickstream/ and read more about the release and those who contributed to it at https://blog.wikimedia.org/2018/01/16/wikipedia-rabbit-hole-clickstream/
Best, Leila
-- Leila Zia Senior Research Scientist Wikimedia Foundation
On Tue, Feb 17, 2015 at 11:00 AM, Dario Taraborelli <dtaraborelli@wikimedia.org> wrote:
We’re glad to announce the release of an aggregate clickstream dataset extracted from English Wikipedia:
http://dx.doi.org/10.6084/m9.figshare.1305770
This dataset contains counts of *(referer, article)* pairs aggregated from the HTTP request logs of English Wikipedia. This snapshot captures 22 million *(referer, article)* pairs from a total of 4 billion requests collected during the month of January 2015.
This data can be used for various purposes:
• determining the most frequent links people click on for a given article
• determining the most common links people followed to an article
• determining how much of the total traffic to an article clicked on a link in that article
• generating a Markov chain over English Wikipedia
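As a rough illustration of the uses listed above, here is a minimal Python sketch that works on *(referer, article, count)* triples held in memory. The sample rows, their counts, and the function names are invented for illustration and are not taken from the dataset. The actual file layout and column names of the release are not assumed here, so loading the data is left out.

```python
from collections import defaultdict

# Hypothetical sample rows: (referer, article, count). Numbers are invented.
SAMPLE_ROWS = [
    ("Alan_Turing", "Turing_machine", 60),
    ("Alan_Turing", "Enigma_machine", 40),
    ("Enigma_machine", "Alan_Turing", 30),
    ("Main_Page", "Alan_Turing", 500),
]


def pair_counts(rows):
    """Sum counts per (referer, article) pair, merging any duplicate rows."""
    counts = defaultdict(int)
    for referer, article, n in rows:
        counts[(referer, article)] += n
    return counts


def top_links_from(rows, referer, k=3):
    """Most frequently clicked articles when coming from `referer`."""
    found = [(a, n) for (r, a), n in pair_counts(rows).items() if r == referer]
    return sorted(found, key=lambda item: item[1], reverse=True)[:k]


def top_links_to(rows, article, k=3):
    """Most common referers that led readers to `article`."""
    found = [(r, n) for (r, a), n in pair_counts(rows).items() if a == article]
    return sorted(found, key=lambda item: item[1], reverse=True)[:k]


def markov_chain(rows):
    """Turn counts into transition probabilities P(article | referer)."""
    counts = pair_counts(rows)
    totals = defaultdict(int)
    for (referer, _article), n in counts.items():
        totals[referer] += n
    chain = defaultdict(dict)
    for (referer, article), n in counts.items():
        chain[referer][article] = n / totals[referer]
    return dict(chain)
```

For example, `top_links_from(SAMPLE_ROWS, "Alan_Turing", k=1)` returns the single most-clicked target for that referer, and `markov_chain(SAMPLE_ROWS)["Alan_Turing"]` gives a dictionary of outgoing probabilities that sum to 1. Estimating what share of an article's total traffic clicked a given link would additionally need that article's total view count, which this sketch does not model.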
We created a page on Meta for feedback and discussion about this release: https://meta.wikimedia.org/wiki/Research_talk:Wikipedia_clickstream
Ellery and Dario
Analytics mailing list Analytics@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/analytics
Wiki-research-l mailing list Wiki-research-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wiki-research-l