Wikimedia-l August 2022

wikimedia-l@lists.wikimedia.org

69 participants
55 discussions

Episode 5 Inspiring Open Podcast episode is out
by Florence Devouard 07 Oct '22

Hello,

The Wiki Loves Women team launched a podcast a few weeks ago. We have released 5 episodes so far, at a rate of two episodes per month. All episodes are available on the usual podcast platforms, or can be accessed on the Wiki Loves Women website with additional notes about each episode: https://podcast.wikiloveswomen.org

The latest episode features Angela Lungati, current CEO of Ushahidi.

If you would like to receive a brief message on your talk page each time a new episode is published, please add your name here: https://meta.wikimedia.org/wiki/Wiki_Loves_Women/Podcast#Subscribe

Anthere

------------------
About Inspiring Open

Inspiring Open is a podcast series about women from Wiki Loves Women that celebrates the inspirational women whose careers and personal ethics intersect with the Open movement. Each episode features a dynamic woman from Africa who has pushed the boundaries of what it means to build communities and succeed as a collective. As a podcast series, it is available anytime, anywhere, to amplify the motivational stories of each guest, as spoken in their own voice. Listen to their personal journeys in conversation with host Betty Kankam-Boadu. Join Inspiring Open as we raise the global visibility and profiles of women who are redefining and reclaiming the Open sector.

Be inspired • Be challenged • Be bold!

[Foundation-l] Improving search in Wikipedia through quality and concept discovery
by Brian J Mingus 09 Sep '22

This paper (first reference) is the result of a class project I was part of almost two years ago for CSCI 5417 Information Retrieval Systems. It builds on a class project I did in CSCI 5832 Natural Language Processing, which I presented at Wikimania '07. The project ran very late: we didn't send in the final paper until the day before New Year's. As far as I recall, this technical report was never really announced, so I thought it would be interesting to look briefly at the results.

The goal of this paper was to break articles down into surface features and latent features, and then use those to study the rating system being used, predict article quality and rank results in a search engine. We used the [[random forests]] classifier, which allowed us to analyze the contribution of each feature to performance by looking directly at the weights that were assigned. While the surface analysis was performed on the whole English Wikipedia, the latent analysis was performed on the Simple English Wikipedia (it is more expensive to compute).

= Surface features =

* Readability measures are the single best predictor I have found of quality as defined by the Wikipedia Editorial Team (WET). The [[Automated Readability Index]], [[Gunning Fog Index]] and [[Flesch-Kincaid Grade Level]] were the strongest predictors, followed by length of article HTML, number of paragraphs, [[Flesch Reading Ease]], [[Smog Grading]], number of internal links, [[Laesbarhedsindex Readability Formula]], number of words and number of references.
* Weakly predictive were number of "to be" verbs, number of sentences, [[Coleman-Liau Index]], number of templates, PageRank, number of external links, number of relative links.
* Not predictive (overall; see the end of section 2 for the per-rating score breakdown): number of h2 or h3's, number of conjunctions, number of images*, average word length, number of h4's, number of prepositions, number of pronouns, number of interlanguage links, average syllables per word, number of nominalizations, article age (based on page id), proportion of questions, average sentence length.
:* Number of images was actually by far the single strongest predictor of any class, but only for Featured articles. Because it was so good at picking out Featured articles and somewhat good at picking out A and G articles, the classifier was confused in so many cases that the overall contribution of this feature to classification performance is zero.
:* Number of external links is strongly predictive of Featured articles.
:* The B class is highly distinctive. It has a strong "signature," with high predictive value assigned to many features. The Featured class is also very distinctive. F, B and S (Start/Stub) contain the most information.
:* A is the least distinct class, not being very different from F or G.

= Latent features =

The algorithm used for latent analysis, which is an analysis of the occurrence of words in every document with respect to the link structure of the encyclopedia ("concepts"), is [[Latent Dirichlet Allocation]]. This part of the analysis was done by CS PhD student Praful Mangalath.

An example of what can be done with the result of this analysis: you provide a word (a search query) such as "hippie". You can then look at the weight of every article for the word hippie. You can pick the article with the largest weight, and then look at its link network. You can pick out the articles that this article links to and/or which link to this article that are also weighted strongly for the word hippie, while also contributing maximally to this article's "hippieness".
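The lookup just described can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code used for the paper: the function name `related_articles`, the `min_weight` threshold, and all weights and links in the toy data are invented for the example, and the "contributing maximally" criterion is not modelled at all.

```python
def related_articles(word_weights, links, min_weight=0.1):
    """Pick the top-weighted article for a query word and its strong link neighbours.

    word_weights: {article title: weight of the query word for that article}
                  (hypothetical stand-in for per-article LDA weights)
    links:        {article title: set of titles that article links to}
    min_weight:   assumed cut-off for "weighted strongly"
    """
    # Article with the largest weight for the query word.
    top = max(word_weights, key=word_weights.get)

    # Its link network: articles it links to, plus articles that link to it.
    outgoing = links.get(top, set())
    incoming = {title for title, targets in links.items() if top in targets}
    neighbours = (outgoing | incoming) - {top}

    # Keep only neighbours that are themselves weighted strongly for the word.
    strong = [t for t in neighbours if word_weights.get(t, 0.0) >= min_weight]
    return top, sorted(strong, key=lambda t: (-word_weights[t], t))


# Toy data: titles borrowed from this post, numbers and links made up.
weights = {"Hippie": 0.9, "Woodstock festival": 0.6, "Summer of Love": 0.5,
           "Acid rock": 0.3, "South Park": 0.05}
links = {"Hippie": {"Woodstock festival", "South Park"},
         "Summer of Love": {"Hippie"},
         "Acid rock": {"Woodstock festival"}}

top, neighbours = related_articles(weights, links)
print(top, neighbours)
```

With this toy data, "Acid rock" is dropped because it has no direct link to or from the top article, and "South Park" is dropped because its weight falls below the assumed threshold.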
We tried this query in our system (LDA), Google (site:en.wikipedia.org hippie), and the Simple English Wikipedia's Lucene search engine. The breakdown of articles occurring in the top ten search results for this word for those engines is:

* LDA only: [[Acid rock]], [[Aldeburgh Festival]], [[Anne Murray]], [[Carl Radle]], [[Harry Nilsson]], [[Jack Kerouac]], [[Phil Spector]], [[Plastic Ono Band]], [[Rock and Roll]], [[Salvador Allende]], [[Smothers brothers]], [[Stanley Kubrick]].
* Google only: [[Glam Rock]], [[South Park]].
* Simple only: [[African Americans]], [[Charles Manson]], [[Counterculture]], [[Drug use]], [[Flower Power]], [[Nuclear weapons]], [[Phish]], [[Sexual liberation]], [[Summer of Love]]
* LDA & Google & Simple: [[Hippie]], [[Human Be-in]], [[Students for a democratic society]], [[Woodstock festival]]
* LDA & Google: [[Psychedelic Pop]]
* Google & Simple: [[Lysergic acid diethylamide]], [[Summer of Love]]

(See the paper for the articles produced for the keywords philosophy and economics.)

= Discussion / Conclusion =

* The results of the latent analysis are totally up to your perception. But what is interesting is that the LDA features predict the WET ratings of quality just as well as the surface-level features. Both feature sets (surface and latent) pull out almost all of the information that the rating system bears.
* The rating system devised by the WET is not distinctive. You can best tell the difference between, grouped together, Featured, A and Good articles vs B articles. Featured, A and Good articles are also quite distinctive (Figure 1). Note that in this study we didn't look at Starts and Stubs, but in an earlier paper we did.
:* This is interesting when compared to this recent entry on the YouTube blog, "Five Stars Dominate Ratings": http://youtube-global.blogspot.com/2009/09/five-stars-dominate-ratings.html… I think a sane, well-researched (with actual subjects) rating system is well within the purview of the Usability Initiative.
Helping people find and create good content is what Wikipedia is all about. Having a solid rating system allows you to reorganize the user interface, the Wikipedia namespace, and the main namespace around good content and bad content as needed. If you don't have a solid, information-bearing rating system you don't know what good content really is (really bad content is easy to spot).
:* My Wikimania talk was all about gathering data from people about articles and using that to train machines to automatically pick out good content. You ask people questions along dimensions that make sense to people, and give the machine access to other surface features (such as a statistical measure of readability, or length) and latent features (such as can be derived from document word occurrence and encyclopedia link structure). I referenced page 262 of Zen and the Art of Motorcycle Maintenance to give an example of the kind of qualitative features I would ask people about. It really depends on what features end up bearing information, to be tested in "the lab". Each word is an example dimension of quality: we have "*unity, vividness, authority, economy, sensitivity, clarity, emphasis, flow, suspense, brilliance, precision, proportion, depth and so on.*" You then use surface and latent features to predict these values for all articles. You can also say that when a person rates an article as high on the x scale, they also mean that it has this much of these surface and these latent features.

= References =

- DeHoust, C., Mangalath, P., Mingus, B. (2008). *Improving search in Wikipedia through quality and concept discovery*. Technical Report. PDF <http://grey.colorado.edu/mediawiki/sites/mingus/images/6/68/DeHoustMangalat…>
- Rassbach, L., Mingus, B., Blackford, T. (2007). *Exploring the feasibility of automatically rating online article quality*. Technical Report. PDF <http://grey.colorado.edu/mediawiki/sites/mingus/images/d/d3/RassbachPincock…>
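For readers unfamiliar with the readability measures named under the surface features, here is a minimal sketch of one of them, the Automated Readability Index, using its standard published formula. The tokenisation (a regular expression for words, splitting on sentence-ending punctuation) is a simplifying assumption and is not necessarily how the paper extracted its features.

```python
import re


def automated_readability_index(text: str) -> float:
    """ARI = 4.71 * (characters / words) + 0.5 * (words / sentences) - 21.43."""
    # Assumed tokenisation: runs of letters/digits are words,
    # and '.', '!' or '?' ends a sentence.
    words = re.findall(r"[A-Za-z0-9]+", text)
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    if not words or not sentences:
        raise ValueError("text must contain at least one word and one sentence")

    characters = sum(len(word) for word in words)
    return (4.71 * characters / len(words)
            + 0.5 * len(words) / len(sentences)
            - 21.43)


# Short words and short sentences give a low score; longer ones score higher.
print(automated_readability_index("The cat sat. The dog ran."))
```

A score like this would be one column in a per-article feature table alongside the other surface features, such as length or number of internal links.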

📚 Back for a third round! Apply to join the Training of Trainers for Reading Wikipedia in the Classroom
by Sailesh Patnaik 07 Sep '22

*tl;dr:* Looking for a way to engage secondary school teachers with Wikipedia? Become a Certified Trainer of the Reading Wikipedia in the Classroom program! Applications are open from Aug. 15 - Sept. 4. Information session coming up on August 25 at 13:30 UTC.

Dear Wikimedians,

The Education team at the Wikimedia Foundation is happy to announce that applications are open for the third cohort of the Training of Trainers (ToT) program for “Reading Wikipedia in the Classroom”! This call is open to Wikimedians, educators, and members of mission-aligned organizations who believe in the value of Wikipedia to advance key 21st-century skills <http://www.battelleforkids.org/networks/p21> [1] and who are interested in implementing the “Reading Wikipedia in the Classroom” program in their own countries.

Reading Wikipedia in the Classroom <https://wikimediafoundation.org/our-work/education/reading-wikipedia-in-the…> [2] is a professional development program for teachers that guides them to approach Wikipedia through UNESCO’s Media and Information Literacy framework, leveraging the value of Wikipedia as a pedagogical tool. The educational resources <https://meta.wikimedia.org/wiki/Education/Reading_Wikipedia_in_the_Classroo…> of the program have already been adapted to 9 languages [3], and Certified Trainers of the program have led implementations in Nigeria <https://meta.wikimedia.org/wiki/Reading_Wikipedia_in_the_Classroom_Nigeria> [4], Bolivia <https://meta.wikimedia.org/wiki/Education/News/June_2022/Bolivian_Teachers_…> [5] and Guinea <https://youtu.be/ENoQjc4IG5g> [6], to name a few countries.

The Training of Trainers program is a 9-week online learning experience that prepares participants to implement the Reading Wikipedia in the Classroom program locally. By the end of the ToT, successful participants will be certified to lead the Reading Wikipedia in the Classroom program in their countries.
Participants who successfully complete the ToT will also have access to a pool of funding that can support their first implementation of the program.

Applications are open from August 15 to September 4, 2022. You can find the application form, important dates, and general guidance on Meta-Wiki <https://meta.wikimedia.org/wiki/Education/Reading_Wikipedia_in_the_Classroo…> [7] and check out this newsletter article <https://meta.wikimedia.org/wiki/Education/News/January_2022/Welcoming_new_t…> [8] to learn about the experience and learnings of the first ToT cohort in 2021.

You’re invited to join the information session on August 25 at 13:30 UTC, where we’ll share more details and answer any pending questions (find the link to join on Meta-Wiki). If you have any questions ahead of this event, please don’t hesitate to reach out to the team at education(a)wikimedia.org

We look forward to seeing you there!

--the Education Team

[1] http://www.battelleforkids.org/networks/p21
[2] https://wikimediafoundation.org/our-work/education/reading-wikipedia-in-the…
[3] https://meta.wikimedia.org/wiki/Education/Reading_Wikipedia_in_the_Classroo…
[4] https://meta.wikimedia.org/wiki/Reading_Wikipedia_in_the_Classroom_Nigeria
[5] https://meta.wikimedia.org/wiki/Education/News/June_2022/Bolivian_Teachers_…
[6] https://youtu.be/ENoQjc4IG5g
[7] https://meta.wikimedia.org/wiki/Education/Reading_Wikipedia_in_the_Classroo…
[8] https://meta.wikimedia.org/wiki/Education/News/January_2022/Welcoming_new_t…

--
Sailesh Patnaik (He/Him/His)
Program Officer, Education
Wikimedia Foundation <https://wikimediafoundation.org/>

Outreachy Round 25–call for projects and mentors now open!
by Srishti Sethi 06 Sep '22

Hello everyone,

Wikimedia is participating in the winter edition of this year's Outreachy <https://www.outreachy.org/> [1] (December 2022–March 2023)! The deadline to submit projects on the Outreachy website is September 30th, 2022. We are currently working on a list of interesting project ideas. If you have some ideas for coding or non-coding (design, documentation, translation, outreach, research) projects, share them here: <https://phabricator.wikimedia.org/T313361> [2].

*About the Outreachy program*

Outreachy offers three-month internships to work remotely in Free and Open Source Software (FOSS), coding, and non-coding projects with experienced mentors. These internships run twice a year, from May to August and from December to March. Interns are paid a stipend of USD 7000 for the three months of work. Interns often find employment after their internship with Outreachy sponsors or jobs that use the skills they learned during their internship.

This program is open to both students and non-students. Outreachy expressly invites the following people to apply:

* Women (both cis and trans), trans men, and genderqueer people.
* Anyone who faces under-representation, systematic bias, or discrimination in the technology industry in their country of residence.
* Residents and nationals of the United States of any gender who are Black/African American, Hispanic/Latinx, Native American/American Indian, Alaska Native, Native Hawaiian, or Pacific Islander.

See a blog post highlighting the experiences and outcomes of interns who participated in a previous round of Outreachy with Wikimedia <https://techblog.wikimedia.org/2021/06/02/outreachy-round-21-experiences-an…> [3]

*Tips for mentors for proposing projects*

* Follow this task description template when you propose a project in Phabricator: <https://phabricator.wikimedia.org/tag/outreach-programs-projects> [4]. Add the #Outreachy-Round-25 tag.
* A project should require an experienced developer ~15 days and a newcomer ~3 months to complete.
* Each project should have at least two mentors, with one of them holding a technical background.
* Ideally, the project has no tight deadlines, a moderate learning curve, and fewer dependencies on Wikimedia's core infrastructure. Projects addressing the needs of a language community are most welcome.
* If you don't have an idea in mind and would like to pick one from an existing list, check out these projects: <https://phabricator.wikimedia.org/tag/outreach-programs-projects/> [4]
* To learn more about the roles and responsibilities of mentors, visit our resources on MediaWiki.org: <https://www.mediawiki.org/wiki/Outreachy/Mentors> [5].

We look forward to your participation!

Cheers,
Srishti

[1] https://www.outreachy.org/
[2] https://phabricator.wikimedia.org/T313361
[3] https://techblog.wikimedia.org/2021/06/02/outreachy-round-21-experiences-an…
[4] https://phabricator.wikimedia.org/tag/outreach-programs-projects/
[5] https://www.mediawiki.org/wiki/Outreachy/Mentors

*Srishti Sethi*
Senior Developer Advocate
Wikimedia Foundation <https://wikimediafoundation.org/>

Is GoogleTV violating Wikipedia's license?
by F. Xavier Dengra i Grau 31 Aug '22

Hi all,

I want to raise a legal concern here about Google's misuse of our content. [It came up today on Twitter](https://twitter.com/epineda/status/1564143156702199813?s=20&t=z2xu…) that the GoogleTV app had attached a movie description in Catalan (which in principle should be good news for language normalization). However, shortly afterwards a Wikipedian colleague realised that the text was taken in full from the Catalan Wikipedia. After downloading the app myself, I confirmed that Google does not specify anywhere (or at least nowhere minimally visible that I could find) that those lines come from Wikipedia: neither the origin, nor the license, nor a link to the full article or to the CC license.

I'd like to recall the licensing footer on Wikipedia ("Text is available under the [Creative Commons Attribution-ShareAlike License 3.0](https://en.wikipedia.org/wiki/Wikipedia:Text_of_Creative_Commons_Attri…)") and its conditions, as well as to ask others to check whether there are more situations like this one.

It's worth noting how harmful this is to minoritised-language Wikipedias: not only the legal issue itself, but also the legitimate clicks and views that we end up losing, the confusion among readers who think this is a win by Google (the example I shared, with both screenshots enclosed), and even a subsequent chicken-and-egg situation that can lead to articles being deleted by users who think the content was stolen from Google and not the other way round.

I remember that there was a previous thread here, not so long ago, about the problems of Google taking over our data and therefore diminishing clicks to the Wikimedia projects. Considering that I am fully against the GAFAM drift that the WMF is increasingly adopting by benefiting from Google in our human, economic and digital structures, I prefer to share it here as well, and not only with the legal team of the WMF (cc'd).

Kind regards,
Xavier Dengra

Invitation to episode 15 of WikiAfrica Hour
by Ceslause Ogbonnaya 30 Aug '22

Dear Wikimedians,

I'm delighted to invite you to episode 15 of WikiAfrica Hour, titled *Wikimedians In Residence*. The session will shine a light on the partnership between Wikimedia and host organisations whose members are interested in a productive relationship with the encyclopedia and its community.

To make the experience a fun and memorable one, we have invited the following guest speakers:

1. Bobby Shabangu - WiR, United Nations Development Programme, South Africa
2. Florence Devouard - WiR, World Intellectual Property Organization (WIPO)
3. Nicolas Vigneron - WiR, Clermont Auvergne University
4. Alice Kibombo - WiR, African Library and Information Associations (AfLIA)
5. Daniel Obiokeke - WiR, The Africa Narrative

Date: 2nd September 2022
Time: 4pm UTC
Details: https://w.wiki/5dft <https://t.co/Z5FogXbr5l>

Regards,
Ceslause Ogbonnaya
*Host, WikiAfrica Hour*

[Small Wiki Toolkit] Writing Wikidata Queries Using WDQS Tool Workshop On Tuesday, August 30th, 16:00 UTC
by Seyram Komla Sapaty 30 Aug '22

Hello everyone,

The seventh workshop, on the topic of "Writing Wikidata queries using the WDQS tool", is coming up. It will take place on Tuesday, August 30th at 16:00 UTC. You can find more details on the workshop and a link to join here <https://meta.wikimedia.org/wiki/Small_wiki_toolkits/Workshops#Writing_Wikid…> [1].

This session will focus on how to access and save information to Wikidata, including through page generators and queries++, using pywikibot. It will also cover the Wikidata conventions about bot-editing, including the bot approvals process.

To participate in this workshop, you will need basic familiarity with Wikidata and a Pywikibot installation. You can add your discussion ideas on the etherpad doc <https://etherpad.wikimedia.org/p/swt-wikidata-pywikibot-scripts> [2].

We look forward to your participation!

[1]: https://meta.wikimedia.org/wiki/Small_wiki_toolkits/Workshops#Writing_Wikid…
[2]: https://etherpad.wikimedia.org/p/swt-wikidata-pywikibot-scripts

Thanks.
--
Seyram Komla Sapaty
Developer Advocate
Wikimedia Cloud Services
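As a taste of the workshop topic, here is a minimal sketch of how a query for the Wikidata Query Service (WDQS) can be turned into a request URL. It is not taken from the workshop materials: the helper name `build_wdqs_url` and the sample query (five items that are an instance of house cat) are illustrative assumptions, and the sketch only builds the URL without sending any request.

```python
from urllib.parse import urlencode

# Public SPARQL endpoint of the Wikidata Query Service.
WDQS_ENDPOINT = "https://query.wikidata.org/sparql"

# Sample query (illustrative): items whose "instance of" (P31) is "house cat" (Q146).
SAMPLE_QUERY = """
SELECT ?item WHERE {
  ?item wdt:P31 wd:Q146 .
}
LIMIT 5
"""


def build_wdqs_url(sparql: str) -> str:
    """Return a GET URL that asks WDQS to run `sparql` and answer in JSON."""
    # urlencode percent-encodes the query text so it is safe inside a URL.
    return WDQS_ENDPOINT + "?" + urlencode({"query": sparql, "format": "json"})


print(build_wdqs_url(SAMPLE_QUERY))
```

The resulting URL can be opened in a browser or fetched with any HTTP client; a script that does so should send a descriptive User-Agent header.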

The 2022 Board of Trustees election Community Voting is about to Close
by Mahuton Possoupe 30 Aug '22

Hello,

The Community Voting period of the 2022 Board of Trustees election started on August 23, 2022, and will close on September 6, 2022 at 23:59 UTC. There’s still a chance to participate in this election. If you have not voted yet, please visit the SecurePoll voting page <https://meta.wikimedia.org/wiki/Special:SecurePoll/vote/Wikimedia_Foundatio…> to vote now. To check your voter eligibility, please visit the voter eligibility page <https://meta.wikimedia.org/wiki/Special:MyLanguage/Wikimedia_Foundation_ele…>.

If you need help in making your decision, here are some helpful links:

- Try the Election Compass <https://meta.wikimedia.org/wiki/Special:MyLanguage/Wikimedia_Foundation_ele…>, showing how candidates stand on 15 different topics.
- Read the candidate statements <https://meta.wikimedia.org/wiki/Special:MyLanguage/Wikimedia_Foundation_ele…> and answers to affiliates' questions <https://meta.wikimedia.org/wiki/Special:MyLanguage/Wikimedia_Foundation_ele…>.
- Learn more about the skills the Board seeks <https://meta.wikimedia.org/wiki/Special:MyLanguage/Wikimedia_Foundation_ele…> and how the Analysis Committee found candidates align with those skills <https://meta.wikimedia.org/wiki/Special:MyLanguage/Wikimedia_Foundation_ele…>
- Watch the videos of the candidates answering questions proposed by the community <https://meta.wikimedia.org/wiki/Special:MyLanguage/Wikimedia_Foundation_ele…>.

Best,
Movement Strategy and Governance

Call for applications: Africa Knowledge Initiative (AKI) Implementing partners
by Ceslause Ogbonnaya 29 Aug '22

Dear all,

The Africa Knowledge Initiative (AKI) working group is happy to announce a call for applications for implementing partners for the various campaigns in the project <https://diff.wikimedia.org/2022/08/04/africa-knowledge-initiative-welcomes-…>.

The implementing partners will be responsible for organizing a campaign on one of these African Union holidays:

- Africa Youth Day – Nov 1, 2022
- Environment/Wangari Maathai Day – Mar 3, 2023
- Africa Day – May 25, 2023

Interested organizations, affiliates (user groups/chapters) and groups are welcome to submit applications via this application form <https://docs.google.com/forms/d/e/1FAIpQLSfyaySvNhTPF-F0WvwMdRHIUcWGgyYAtDc…>.

*Minimum Requirements*

- Must be an organization, affiliate (user group or chapter) or a group
- Experience organizing international campaigns
- Experience organizing around the topical area of interest
- Must be in good standing with the grants system/AffCom requirements at the WMF

*NB:* Only shortlisted applicants will be contacted. *Application deadline is 4th September 2022*. Enquiries about the application or the Africa Knowledge Initiative (AKI) project should be sent to campaigns(a)wikimedia.org.

Regards,
Ceslause Ogbonnaya
*Wikimedian In Residence, Africa Knowledge Initiative (AKI)*

Presenting about transparency in Wikimedia's technical spaces - July 23, 17:00 UTC
by Kunal Mehta 28 Aug '22

Hi,

Tomorrow at the HOPE 2022 conference, I'm giving a talk titled "How to Run a Top-10 Website, Publicly and Transparently", discussing the impact of transparency in Wikimedia's technical spaces. A number of people have expressed interest in watching, including non-technical users, so I'm advertising it a bit more broadly. I apologize for the short notice; I didn't realize the stream would be free to watch until yesterday (thanks Ori!).

Time: 2022-07-23 17:00 UTC (1pm ET) - https://zonestamp.toolforge.org/1658595637
Stream: https://hope.net/416dac.html

If you can't watch it live, a recording will be uploaded later on. I've documented all of this on-wiki, including the full abstract: <https://meta.wikimedia.org/wiki/User:Legoktm/HOPE_2022>.

I am of course happy to answer any questions people might have after the talk!

Thanks,
-- Kunal / Legoktm
