Analytics

analytics@lists.wikimedia.org

2 participants
1825 discussions

Conceptual differences between «Pageview» «Pageview complete» dumps
by Ismael Olea 17 Jan '23

Hi:

I can't find any (official?) definition of the nature and practical use of the «Pageview» and «Pageview complete» dumps, other than a brief description at Dumps[1].

Searching for "pageview complete" returns no results at:
- Wikitech: https://w.wiki/6EDz
- Meta: https://w.wiki/6EDx, except the link to Dumps[2]

Maybe this is because the complete ones are by design related to Wikipedia Administrative Pages Analytics[3]? Or because they include data from before the latest pageview definition (2015)? Maybe both?

[1] https://dumps.wikimedia.org/other/analytics/
[2] https://dumps.wikimedia.org/other/pageview_complete/readme.html
[3] https://meta.wikimedia.org/wiki/Wikipedia_Administrative_Pages_Analytics

--
Ismael Olea
http://olea.org/diario/

2 3

About «The date(s) you used are valid, but we either do not have data for those date(s)»
by Ismael Olea 17 Jan '23

Hi:

As this is the first time I'm working with Wikimedia analytics, I found a case that seemed weird to me. In some cases I can't get data from the API:
- en.wikivoyage.org
- Culturally significant landscapes in Jaén
- 2022121700
- API call: https://w.wiki/6DjC

I got the «The date(s) you used are valid, but we either do not have data for those date(s)» message, which looks strange to me because the resource exists, as can be checked:
- 2022121600
- API call: https://w.wiki/6DjE

If there were no visits for 2022121700, I would have expected a valid response with value=0. Is this the expected behavior, or have I found a glitch? I found a few other cases, so I prefer to ask here.

Thanks.

--
Ismael Olea
http://olea.org/diario/
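For anyone hitting the same message, here is a minimal client-side sketch. It assumes the public per-article Pageviews REST endpoint with daily granularity, and it assumes (the post above does not confirm this) that a day absent from the response can be read as zero recorded views. The helper names `per_article_url` and `zero_fill` are hypothetical, and no network call is made.

```python
from datetime import date, timedelta
from urllib.parse import quote

# Assumed endpoint root for per-article pageview counts.
API_ROOT = "https://wikimedia.org/api/rest_v1/metrics/pageviews/per-article"


def per_article_url(project, article, start, end,
                    access="all-access", agent="all-agents"):
    """Build a daily per-article request URL; start and end are datetime.date."""
    # Titles use underscores instead of spaces and must be percent-encoded.
    title = quote(article.replace(" ", "_"), safe="")
    return "/".join([
        API_ROOT, project, access, agent, title, "daily",
        start.strftime("%Y%m%d") + "00",
        end.strftime("%Y%m%d") + "00",
    ])


def zero_fill(items, start, end):
    """Map every day in [start, end] to a view count.

    `items` is the list found under "items" in a response body. Days the
    response omits are filled with 0, which is an assumption about what
    "no data" means, not documented behavior.
    """
    views = {item["timestamp"]: item["views"] for item in items}
    filled = {}
    day = start
    while day <= end:
        stamp = day.strftime("%Y%m%d") + "00"
        filled[stamp] = views.get(stamp, 0)
        day += timedelta(days=1)
    return filled
```

With this approach, a request covering 2022121600 to 2022121700 that returns only the first day would yield an explicit 0 for the second. If the whole range returns the "do not have data" error, the caller could pass an empty list to `zero_fill`, again under the same assumption.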

2 2

Quality analytics
by Ismael Olea 16 Jan '23

Hi:

Do we have tools, metrics, or traces for the evolution of article quality, or something similar? I'm not sure whether the ORES technology is appropriate for it.

--
Ismael Olea
http://olea.org/diario/

2 1

[Wikimedia Research Showcase] January 18
by Emily Lescak 12 Jan '23

Hello everyone,

The next Research Showcase, focused on Editor Retention, will be live-streamed Wednesday, January 18. Find your local time here <https://zonestamp.toolforge.org/1674063059>.

YouTube stream: https://www.youtube.com/watch?v=gS8ELcVZ8Q4

You can join the conversation on IRC at #wikimedia-research. You can also watch our past research showcases here: https://www.mediawiki.org/wiki/Wikimedia_Research/Showcase

This month's presentations:

Vital Signs: Measuring Wikipedia Communities’ Health
By *Cristian Consonni, Eurecat - Centre Tecnològic de Catalunya, Barcelona*

Community health in Wikipedia is a complex topic that has been at the center of discussion for Wikipedia and the scientific community for years. Researchers observed that the number of active editors for the largest Wikipedias started declining after an initial phase of exponential growth. Some media outlets picked this fact as a death announcement for the project, but the news of Wikipedia's death turned out to be greatly exaggerated. However, it remains true that researchers and community activists need to understand how to measure community health and describe it more accurately. In this presentation, we would like to go beyond the traditional metrics used to describe the status of the community. We propose the creation of 6 sets of language-independent indicators that we call "Vital Signs." We borrow the analogy from the medical field, as these indicators represent a first step in defining the health status of a community; they can constitute a valuable reference point to foresee and prevent future risks. We present our analysis for several Wikipedia language editions, showing that communities renew their productive force even with stagnating absolute numbers; we observe a general need for renewal in positions related to particular functions or administratorship. We created a dashboard to visualize all the indicators we have computed and hope that the communities will find it helpful for improving their health.

- Paper: Community Vital Signs: Measuring Wikipedia Communities’ Sustainable Growth and Renewal <https://meta.wikimedia.org/wiki/File:Community_Vital_Signs_Research_Paper_-…>

Learning to Predict the Departure Dynamics of Wikidata Editors
By *Guangyuan Piao, Maynooth University*

Wikidata as one of the largest open collaborative knowledge bases has drawn much attention from researchers and practitioners since its launch in 2012. As it is collaboratively developed and maintained by a community of a great number of volunteer editors, understanding and predicting the departure dynamics of those editors are crucial but have not been studied extensively in previous works. In this paper, we investigate the synergistic effect of two different types of features: statistical and pattern-based ones with DeepFM as our classification model which has not been explored in a similar context and problem for predicting whether a Wikidata editor will stay or leave the platform. Our experimental results show that using the two sets of features with DeepFM provides the best performance regarding AUROC (0.9561) and F1 score (0.8843), and achieves substantial improvement compared to using either of the sets of features and over a wide range of baselines.

- Paper: Learning to Predict the Departure Dynamics of Wikidata Editors <https://parklize.github.io/publications/ISWC2021.pdf>

--
Emily Lescak (she / her)
Senior Research Community Officer
The Wikimedia Foundation

1 0

Missing Pageviews data for Jan 8, 2023
by Ben Smith 10 Jan '23

Hello, The Pageviews API <https://wikimedia.org/api/rest_v1/metrics/pageviews/per-article/en.wikipedi…> view count data and the Pageviews dump files <https://dumps.wikimedia.org/other/pageviews/2023/2023-01/> seem to be missing for yesterday (Jan 8, 2023). Does anyone know when that data will be made available? Regards, Ben
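One way to check which hourly dump files have been published for a given day is to compare the directory listing against the names you expect. The sketch below assumes the hourly files follow a `pageviews-YYYYMMDD-HH0000.gz` naming pattern inside the monthly directory linked above. That pattern is an assumption here and should be checked against the actual listing. The function names are hypothetical, and fetching the listing is left to the caller.

```python
from datetime import date

# Root taken from the dump URL in the message above.
DUMPS_ROOT = "https://dumps.wikimedia.org/other/pageviews"


def month_directory(day):
    """Return the monthly directory URL that should contain the day's files."""
    return f"{DUMPS_ROOT}/{day:%Y}/{day:%Y-%m}/"


def expected_hourly_files(day):
    """Return the 24 file names expected for one day (assumed naming pattern)."""
    return [f"pageviews-{day:%Y%m%d}-{hour:02d}0000.gz" for hour in range(24)]


def missing_files(day, listed_names):
    """Return the expected file names that do not appear in a directory listing."""
    present = set(listed_names)
    return [name for name in expected_hourly_files(day) if name not in present]
```

For January 8, 2023, `month_directory(date(2023, 1, 8))` points at the `2023/2023-01/` directory, and passing the file names scraped from that page to `missing_files` would list any hours not yet published.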

3 2

Pageviews per country
by Ismael Olea 25 Dec '22

Hi:

I'm completely new to analytics in Wikimedia. We are working with a heritage institution on a GLAM project, and they are interested in access statistics for the resources they have released in Wikimedia.

I think I understand the pageviews concept and how to use it but, as far as I can tell, it's not possible to get details such as article pageviews per country. Is this correct? If so, how should we get (or process) the information to produce that data?

Also, I'm reading about the resulting format[1], but I can't find the related logs. Any suggestions?

Thanks.

[1] https://meta.wikimedia.org/wiki/Research:Page_view#Resulting_format

--
Ismael Olea
http://olea.org/diario/

5 11

[Wikimedia Research Showcase] December 14
by Emily Lescak 09 Dec '22

Hi all,

The next Research Showcase will be live-streamed next Wednesday, December 14. Find your local time here <https://zonestamp.toolforge.org/1671039024>.

The title of the Showcase is 'A year in review from the WMF Research team: Tying our work to the research community.' The Wikimedia Research community is key to tackling the many strategic challenges of the Wikimedia movement. As we are ending the year, the Research team will reflect on why working with the community is important to us. We will share the initiatives, tools, and resources developed throughout 2022 to bring the community together, facilitate researchers’ contributions to the Wikimedia projects, and encourage a diversity of research questions.

YouTube stream: https://www.youtube.com/watch?v=a0ss9ckUlvQ

You can join the conversation on IRC at #wikimedia-research. You can also watch our past Showcases here: https://www.mediawiki.org/wiki/Wikimedia_Research/Showcase

Warm regards,
Emily

--
Emily Lescak (she / her)
Senior Research Community Officer
The Wikimedia Foundation

1 0

[Wikimedia Research Showcase] November 16
by Emily Lescak 09 Nov '22

Hello everyone,

The next Research Showcase will be live-streamed Wednesday, November 16, at 9:30 AM PST/17:30 UTC. Find your local time here <https://zonestamp.toolforge.org/1668619830>.

YouTube stream: https://www.youtube.com/watch?v=sFanZoHjUnY

Members of the Research team will collect questions on IRC at #wikimedia-research and YouTube.

This month's theme is 'Libraries and Wikipedia Knowledge.'

In the first talk, Laurie Bridges (Oregon State University) and Michael David Miller (McGill University) will co-present on Wikipedia and Academic Libraries.

Abstract: In 2021 an open-access edited book, Wikipedia and Academic Libraries: A Global Project <https://doi.org/10.3998/mpub.11778416>, was published, featuring 20 chapters from over 50 authors. In this presentation, Laurie Bridges, one of the co-editors, will discuss the process for creating and publishing an OA-edited book. Michael David Miller, one of the chapter authors, will discuss his chapter about contributions to local Québécois LGBTQ+ content in Francophone Wikipedia.

The second talk will be on Ethical Considerations of Including Gender Information in Open Knowledge Platforms, presented by Nerissa Lindsey (San Diego State University).

Abstract: In recent years, galleries, libraries, archives, and museums (GLAMs) have sought to leverage open knowledge platforms such as Wikidata to highlight or provide more visibility for traditionally marginalized groups and their work, collections, or contributions. Efforts like Art + Feminism, local edit-a-thons, and, more recently, GLAM institution-led projects have promoted open knowledge initiatives to a broader audience of participants. One such open knowledge project, the Program for Cooperative Cataloging (PCC) Wikidata Pilot, has brought together over seventy GLAM organizations to contribute linked open data for individuals associated with their institutions, collections, or archives. However, these projects have brought up ethical concerns around including potentially sensitive personal demographic information, such as gender identity, sexual orientation, race, and ethnicity, in entries in an open knowledge base about living persons. GLAM institutions are thus in a position of balancing open access with ethical cataloging, which should include adhering to the personal preferences of the individuals whose data is being shared. People working in libraries and archives have been increasingly focusing their energies on issues of diversity, equity, and inclusion in their descriptive practices, including remediating legacy data and addressing biased language. Moving this work into a more public sphere and scaling up in volume creates potential risks to the individuals being described. While adding demographic information on living people to open knowledge bases has the potential to enhance, highlight, and celebrate diversity, it could also potentially be used to the detriment of the subjects through surveillance and targeting activities. In our research we investigated the changing role of metadata and open knowledge in addressing, or not addressing, issues of under- and misrepresentation, especially as they pertain to gender identity as described in the sex or gender property in Wikidata. We reported our findings from a survey investigating how organizations participating in open knowledge projects are addressing ethical concerns around including personal demographic information as part of their projects, including what, if any, policies they have implemented and what implications these activities may have for the living people being described.

You can also watch our past research showcases here: https://www.mediawiki.org/wiki/Wikimedia_Research/Showcase

We hope you can join us!

Warm regards,
Emily, on behalf of the WMF Research team

--
Emily Lescak (she / her)
Senior Research Community Officer
The Wikimedia Foundation

1 0

Mediacounts fields
by Michele Mauri 04 Nov '22

Hello,

For an academic research project, I'd like to see which images are the most viewed through the "media viewer". Do you know if it’s possible to get this information?

I looked on the Wikitech portal, but I found only the mediacounts (https://wikitech.wikimedia.org/wiki/Analytics/Data_Lake/Traffic/Mediacounts), which is not what I’m looking for.

Thank you,
Michele

4 7

[Wikimedia Research Showcase] October 19
by Emily Lescak 17 Oct '22

Hello everyone,

The next Research Showcase will be live-streamed Wednesday, October 19, at 9:30 AM PDT/16:30 UTC. Find your local time here <https://zonestamp.toolforge.org/1666197004>.

YouTube stream: https://www.youtube.com/watch?v=ML-ULyARpU4

Members of the Research team will collect questions on IRC at #wikimedia-research and YouTube.

This month's presentation is a panel discussion celebrating Wikidata's 10th birthday!

October 2022 marks the tenth anniversary of the launch of Wikidata (www.wikidata.org). In ten years, this project has become the largest community-driven free knowledge graph in the world, enabling a common knowledge base for Wikimedia projects. The language-independent nature of Wikidata has greatly improved the maintenance and consistency of knowledge across Wikipedia language editions, fostering knowledge equity in Wikimedia. In addition, since Wikidata is a collaborative project that can be read and edited by humans and machines alike, it is also widely used in third-party applications delivering knowledge as a service for all.

The Wikimedia Research community has devoted significant effort and resources in studying the foundations, capabilities and applications of Wikidata, from the complex requirements of representing real-world knowledge in a multilingual environment to the needs to assess the quality of data and sources in Wikidata.

To learn more about the state of the art of Wikidata and research challenges in the era of AI/ML, we will celebrate this tenth anniversary with a panel that will bring together established researchers/practitioners in this field. The panel will be moderated by Denny Vrandečić (WMF) with panelists Lydia Pintscher (WMDE), Elena Simperl (King's College London), Katherine Thornton (Yale), and Markus Krötzsch (Technical University of Dresden).

You can also watch our past research showcases here: https://www.mediawiki.org/wiki/Wikimedia_Research/Showcase

We hope you can join us!

Warm regards,
Emily, on behalf of the WMF Research team

--
Emily Lescak (she / her)
Senior Research Community Officer
The Wikimedia Foundation

1 0
