Hi Ismael, responses inline:
On Tue, Dec 20, 2022 at 1:05 PM Ismael Olea <ismael(a)olea.org> wrote:
I'm completely new to analytics in Wikimedia.
Welcome! :)
We are working with a heritage institution in a GLAM project and they are
interested in access statistics for the resources they have released in
Wikimedia.
Wikimedia CH and Wikimedia Israel have worked on some dashboards showing
GLAM statistics. You may find their projects
<https://glamwikidashboard.org/> interesting. We are currently working
<https://phabricator.wikimedia.org/T325065> with Wikimedia Israel to move
their dashboard to our cloud infrastructure and eventually update our APIs
to better serve them with the data they need. Until then, it may be
interesting to see what statistics they've focused on and how they get them
from the publicly available data we already provide. You can see all this
in their source code:
https://github.com/yonathan06/cassandra-GLAM-tools
I think I got the point about what the pageviews concept is and how to use
it but, as far as I understand, it's not possible to get details like
article pageviews per country, for example. Is this correct?
We have an ongoing project to release per-country per-article pageview
information. It's hard for privacy reasons, and we are building a privacy
system that takes all that into account. For now, we have pageviews by
country at a high level
<https://stats.wikimedia.org/#/all-projects/reading/page-views-by-country> and
most viewed articles
<https://wikitech.wikimedia.org/wiki/Analytics/AQS/Pageviews#Most_viewed_articles_per_country>
by country. I'm linking to different parts of our data ecosystem so you
can get familiar with it.
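As a starting point, here is a minimal sketch of how one might query the most viewed articles per country. It assumes the endpoint layout described on the wikitech page linked above (country code, access method, then year/month/day); the helper names and the example User-Agent string are only illustrative, so please check the path against the documentation before relying on it.

```python
import json
import urllib.request
from urllib.parse import quote

# Assumed base path of the public Analytics Query Service (AQS) REST API.
AQS_BASE = "https://wikimedia.org/api/rest_v1/metrics"


def top_per_country_url(country, year, month, day, access="all-access"):
    """Build the URL for the most viewed articles in one country on one day.

    `country` is an ISO 3166-1 alpha-2 code such as "ES".
    The path layout is an assumption taken from the linked wikitech page.
    """
    return (
        f"{AQS_BASE}/pageviews/top-per-country/"
        f"{quote(country, safe='')}/{access}/{year:04d}/{month:02d}/{day:02d}"
    )


def fetch_json(url, user_agent):
    """Fetch a URL and decode the JSON body.

    A descriptive User-Agent is sent, as the API usage guidelines request.
    """
    request = urllib.request.Request(url, headers={"User-Agent": user_agent})
    with urllib.request.urlopen(request) as response:
        return json.load(response)


if __name__ == "__main__":
    url = top_per_country_url("ES", 2022, 12, 1)
    # Hypothetical contact string; replace it with your own.
    data = fetch_json(url, "glam-stats-example/0.1 (you@example.org)")
    print(json.dumps(data, indent=2)[:500])
```

Building the URL separately from fetching it makes it easy to inspect or log the request before it goes out.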
If so, what should be the way to get (or process) the information to
produce the data?
The only way is to help with the ongoing (and complex) differential privacy
work <https://phabricator.wikimedia.org/T307245>.
Also, I'm reading about the resulting format[1] but I can't find the
I can see how that can be misleading. For GLAMs, usually you would want to
download media request statistics
<https://dumps.wikimedia.org/other/mediacounts/daily/>, as the
glamwikidashboard I mentioned above does. (They are currently working on
getting as much as they can from the media requests API
<https://wikitech.wikimedia.org/wiki/Analytics/AQS/Mediarequests>
instead.) If you are indeed interested in pageviews, the definition you
linked to talks about the data internally available. Can I ask you to
elaborate a bit more on why you need per-country data?
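In the meantime, if per-file media request counts are enough, a rough sketch of a media requests API query could look like the one below. It assumes the per-file path layout (referer, agent, URL-encoded file path, granularity, start, end) and a "requests" field in each returned item, as I understand them from the wikitech page linked above. The file path in the example is hypothetical, and the function names are only illustrative.

```python
from urllib.parse import quote

# Assumed base path of the public Analytics Query Service (AQS) REST API.
AQS_BASE = "https://wikimedia.org/api/rest_v1/metrics"


def media_requests_url(file_path, start, end, referer="all-referers",
                       agent="user", granularity="daily"):
    """Build a per-file media requests URL.

    `file_path` is the upload path of the file, for example
    "/wikipedia/commons/a/a0/Example.jpg" (hypothetical). It is fully
    URL-encoded, slashes included. `start` and `end` are YYYYMMDD strings.
    """
    encoded = quote(file_path, safe="")
    return (
        f"{AQS_BASE}/mediarequests/per-file/"
        f"{referer}/{agent}/{encoded}/{granularity}/{start}/{end}"
    )


def total_requests(payload):
    """Sum the request counts in a decoded API response.

    Assumes the response looks like {"items": [{"requests": <int>, ...}]}.
    """
    return sum(item["requests"] for item in payload.get("items", []))
```

The resulting URL can be fetched with any HTTP client. Summing the items gives one total per file for the chosen date range, which is usually what a GLAM report needs.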