As part of Wikimedia Germany's work around reference reuse, we wrote a
tool which processes the HTML dumps of all articles and produces
detailed information about how Cite references (and Kartographer maps)
are used on each page.
I'm writing to this list for advice on how to publish the results so that
the data can be easily discovered and consumed by researchers.
Currently, the data is contained in 3,100 JSON and NDJSON files hosted
on a Wikimedia Cloud VPS server, with a total size of 3.4GB. The
outputs can be split or merged into whatever form will make them more
useable.
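
For anyone who wants to experiment with merging, here is a minimal Python
sketch of concatenating several NDJSON files into one. It assumes only
that each non-empty line is a standalone JSON object. The function names
and the file paths in the usage note are hypothetical, and this is not
how the scraper itself writes its output.

```python
import json
from pathlib import Path
from typing import Iterable, Iterator


def read_ndjson(path: Path) -> Iterator[dict]:
    """Yield one parsed record per non-empty line of an NDJSON file."""
    with path.open(encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if line:  # tolerate blank lines between records
                yield json.loads(line)


def merge_ndjson(sources: Iterable[Path], destination: Path) -> int:
    """Concatenate NDJSON files into one file and return the record count."""
    count = 0
    with destination.open("w", encoding="utf-8") as out:
        for source in sources:
            for record in read_ndjson(source):
                # Re-serialize so every output line is exactly one record.
                out.write(json.dumps(record, ensure_ascii=False) + "\n")
                count += 1
    return count
```

A call such as merge_ndjson(sorted(Path("dumps").glob("*.ndjson")),
Path("merged.ndjson")) would merge a local copy of the files, where
"dumps" is a placeholder directory name. Records are streamed one at a
time, so memory use stays small even for the full 3.4GB.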
For an overview of the columns and sample rows, please see this task:
https://phabricator.wikimedia.org/T341751
We plan to run the scraper again in the future, and its modular
architecture makes it simple to include or exclude additional
information if anyone has suggestions about what else we might want to
extract from rendered articles. To read more about the tool itself and
why we decided to process HTML dumps directly, see this post:
https://mw.ludd.net/wiki/Elixir/HTML_dump_scraper
-Adam Wight
[[mw:Adamw]]
https://meta.wikimedia.org/wiki/WMDE_Technical_Wishes
Tidskrift för ABM invites submissions for a thematic issue on information
futures. This special issue aims to explore and examine the transformative
role of digital technologies in shaping the landscape of cultural heritage,
knowledge access and information futures.
Topics of interest include, but are not limited to:
* The future of access to knowledge: Trends, challenges, and opportunities.
* AI and machine learning applications in preserving and disseminating
cultural heritage.
* Digital technologies and their impact on knowledge discovery and access.
* Ethical considerations in the use of AI for information retrieval and
curation.
* User experience and human-computer interaction in accessing knowledge in
a digital world.
* The role of libraries, archives, and museums in facilitating knowledge
access in the digital era.
* Social and cultural implications of the evolving information landscape.
We warmly welcome submissions outside of the specific theme as well.
Tidskrift för ABM publishes scientific papers, travelogues, reviews and
notes. We accept submissions in both Swedish and English.
DEADLINE 1ST OF OCTOBER.
Find the author guidelines and submit your work HERE
<https://journals.uu.se/tabm/about/submissions>.
View and please share our call for papers poster.
<https://www.canva.com/design/DAFs61fyXiY/Z59L3-xOjguExvTbabsoNw/view?utm_co…>
Best,
Biyanto R.
(biyanto.com)
Disclaimer: please do not feel obligated to respond to my email during a
weekend or holiday.
Dear list members,
I am happy to share my new paper which was just published OA in the Journal of Computational Social Science: https://doi.org/10.1007/s42001-023-00225-8
In this paper, I describe a new dataset which covers (almost) all offline meetings within the German-language version of Wikipedia from its launch in 2001 to March 2020. The dataset is published on OSF: https://osf.io/eha4r/
I would be delighted to see other researchers make use of this dataset - be it for substantial research on the intricacies of Wikipedia (I'm always up for collaborations, too!) or as a neat example dataset for teaching (be it on networks, temporal developments, spatial plotting, or on how to combine and work with small and big data).
A thank you goes out to the Wikimedia Foundation for supporting me with a project grant, and to the editor and reviewers handling the publication (you might well be on this list as well :-))!
Best
Nicole
Hi all,
The next Research Showcase, focused on *Rules on Wikipedia*, will be
live-streamed on Wednesday, September 20, at 9:30 AM PDT / 16:30 UTC. Find
your local time here <https://zonestamp.toolforge.org/1695227400>.
YouTube stream: https://youtube.com/live/h89l9JWZBCU?feature=share
As usual, you can join the conversation in the YouTube chat as soon as the
showcase goes live.
This month's presentations:
Variation and overlap in the peer production of community rules: the case
of five Wikipedias
By *Sohyeon Hwang, Northwestern University*
In this talk, I present work analyzing the rules and rule-making on
Wikipedia. The governance of many online communities relies on rules
created by participants. However, work predominantly focuses on efforts
within a single community or on a platform as a whole. Here we investigate
the comparative and relational dimensions of online self-governance in a
set of similar communities by looking at the five largest language editions
of Wikipedia. Using exhaustive trace data spanning almost 20 years since
their founding, we examine patterns in rule-making and overlaps in rule
sets. Our findings show that language editions have similar trajectories of
rule-making activity, replicating and extending a rich body of work that
has focused on English-language Wikipedia alone. We also find that the
language editions have increasingly unique rule sets, even as editing
activity concentrates on rules shared between them. The results suggest
that self-governing communities aligned in key ways may share a common core
of rules and rule-making practices even as they develop and sustain
institutional variations.
Wikipedia Community Policies and Experiential Epistemology: Critical
Information Literacy, Social Justice, and Inclusive Practices
By *Zachary J. McDowell, University of Illinois at Chicago*
Drawing from a meta-analysis of
research on learning outcomes in Wikipedia-based education, this
presentation addresses Wikipedia community policies and practices through
the Framework for Information Literacy in Higher Education from the
Association of College and Research Libraries (ACRL). Wikipedia-based
educational practices, which promote newcomers’ active engagement in the
encyclopedia, have been shown to support experiential learning in critical
information literacy, communication and research outcomes, and social
justice. Exploring the connections between participation in Wikipedia and
transferable skills for information literacy in the context of the current
new media landscape, this presentation grapples with new questions for the
future of information literacies alongside the implications of large
language models (LLMs), systemic biases, and the representation and
inclusion of non-western and indigenous knowledge sources.
You can also watch our past research showcases here:
https://www.mediawiki.org/wiki/Wikimedia_Research/Showcase
Best,
Kinneret
--
Kinneret Gordon
Lead Research Community Officer
Wikimedia Foundation <https://wikimediafoundation.org/>
Hi all,
I just want to pass along some information from my university.
Tidskrift för ABM (TABM) is an e-journal with the aim of publishing
research, reviews, travelogues and notes in the fields of Digital
Humanities, Archival Studies, Library and Information Studies and Museum
and Cultural Heritage Studies. We encourage and support master and PhD
students to publish with us, as well as others in the field. The journal is
run by an editorial board based at the Department of ALM at Uppsala
University. The journal publishes at least one issue a year, in the middle
of December.
You can find more details here <https://journals.uu.se/tabm>.
Best,
Biyanto R.
(biyanto.com)
Disclaimer: please do not feel obligated to respond to my email during a
weekend or holiday.