Hello Srishti,

I couldn't verify whether Krio had any native-speaker contributors either. But it is a better choice than Wp/arn and I can live with it.
Getting the projects approved by 10 September is a tight timeline, but it could be doable.
I think the next steps are:
1) Langcom - announcement of intended approval (on Meta)
2) WMF - contact communities to see if they want to be involved - if not already done?
3) Langcom - verification of the content. This could be the main bottleneck, but it could also be fast, depending on whether experts are found.

Let me know what you think.

On Wed, Sep 4, 2024 at 9:48 PM Srishti Sethi <ssethi@wikimedia.org> wrote:

Hello MF-Warburg,


Thanks a lot for your valuable insights! Please see our response inline:


On Fri, Aug 30, 2024 at 12:46 AM MF-Warburg <mfwarburg@googlemail.com> wrote:
The "number of editors" is according to the Catanalysis counting normally used by Langcom, which varies from the criteria mentioned in the proposal. Just to explain any discrepancy.

About the discrepancy in the number of editors, it could be because Catanalysis counts only categories, or it could be due to the fact that as part of the selection criteria, we excluded editors who edit across more than 5 languages in the Incubator, considering that they may not be associated with a specific language community and are generally enthusiastic about helping other communities.
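To make the exclusion rule above concrete, here is a minimal sketch of how such a count might work. It is not the team's actual analysis code: the editor names, the `Wp/xx1` and `Wp/xx2` prefixes, and the function name `count_community_editors` are all hypothetical, and only the "more than 5 languages" threshold comes from this thread.

```python
from collections import defaultdict

# Hypothetical sample data: (editor, Incubator test-wiki prefix) pairs.
# Editor names and the Wp/xx1, Wp/xx2 prefixes are placeholders, not real stats.
edits = [
    ("EditorA", "Wp/kri"),
    ("EditorA", "Wp/kri"),
    ("EditorB", "Wp/kri"),
    ("EditorB", "Wp/rsk"),
    ("EditorB", "Wp/tdd"),
    ("EditorB", "Wp/arn"),
    ("EditorB", "Wp/xx1"),
    ("EditorB", "Wp/xx2"),
]

MAX_LANGUAGES = 5  # threshold from the thread: "more than 5 languages"


def count_community_editors(edits, max_languages=MAX_LANGUAGES):
    """Count editors per test wiki, skipping editors who are active in
    more than `max_languages` distinct test wikis."""
    # First pass: collect the set of test wikis each editor has touched.
    languages_by_editor = defaultdict(set)
    for editor, prefix in edits:
        languages_by_editor[editor].add(prefix)

    # Second pass: count only editors at or below the threshold.
    editors_by_prefix = defaultdict(set)
    for editor, prefix in edits:
        if len(languages_by_editor[editor]) <= max_languages:
            editors_by_prefix[prefix].add(editor)
    return {prefix: len(names) for prefix, names in editors_by_prefix.items()}


counts = count_community_editors(edits)
```

In this made-up data, EditorB edits six test wikis and is therefore left out, so only EditorA is counted for Wp/kri. A tool that does not apply such a filter would report a higher number for the same wiki.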

All requests on Meta are marked as eligible. All five wikis would still require a verification of the content.

We agree with this.
 
As Sotiale already pointed out, Wp/tdd and Wp/rsk fulfill the approval criteria anyway, i.e. they don't need to be approved under this experimental scheme but could be approved normally. It seems to me that it would be unfair to "clearly mark these wikis as new to distinguish them from other production wikis for the pilot period" then.

Among the five data clusters formed for the experiment, the first two are related to low activity, while the last two are related to high activity. Wp/rsk has 1,000 edits in the last 3 months, and there are 6 languages at the same level. For Wp/tdd, there are 8 languages. So the experiment will allow us to compare Wp/rsk to the other similar 6 languages and compare Wp/tdd to the other similar 8 languages. We will not distinguish them as different from other production wikis, but will mark them in some way to indicate that they are being monitored.
 
I have my doubts about the suitability of Wp/arn, given the extremely low number of edits and editors. Also, as far as I could see, none of them seems to be a native speaker of the language, which we absolutely want to avoid. There is also still the old problem of the code being perceived as pejorative <https://meta.wikimedia.org/wiki/Requests_for_new_languages/Wikipedia_Mapudungun_2#arn>.

Thanks for digging into the Catanalysis stats! Taking your concerns into account, we are proposing an alternative to Arn. Here are the details for the Krio language (kri):

Screenshot from 2024-09-04 18-41-08.png
 
Lastly, we would appreciate the Language Committee’s support in approving the 5 language wikis by September 10th. After this date, we would like to proceed to the next steps in the experiment. Your timely approval will help us stick to the project timeline and allow us sufficient time to monitor the wikis and learn from this experiment.

Cheers,
Srishti

On Fri, Aug 30, 2024 at 9:25 AM Srishti Sethi <ssethi@wikimedia.org> wrote:
Hello all,

Thank you so much for taking the time to review the proposal and for sharing your thoughts and questions. Please see my response inline.

On Tue, Aug 27, 2024 at 5:25 AM Sotiale Wiki <sotiale.wm@gmail.com> wrote:
I haven't read the details so I don't know the criteria for the random samples that the feature shows, but if we rule out other criteria, it looks like rsk, tdd definitely meet the community sustainability criteria. I think it's probably just a list of the most likely ones in ascending order. But this alone could be very useful. It's definitely more convenient than having to manually check recent activity all the time.

That's very helpful to hear that metrics like this could be useful tools for monitoring activity.
 
On Mon, Aug 26, 2024 at 9:29 PM Denis Smajlović <deni@deni.dk> wrote: 
I am unable to get an overview of the exact changes that you are proposing to the process. I am specifically interested in: 
Why does the current system not work?
What specific changes do you suggest be implemented? 
 

Thanks for your question! This experiment addresses the issue of languages spending many years in the Incubator before they can graduate, as well as the technical challenges they face while editing. The technical challenges faced by contributors to small language versions of Wikipedia are also highlighted in the Language Diversity Hub’s research findings <https://commons.wikimedia.org/wiki/File:Barriers_experienced_by_contributors_to_small_language_versions_of_Wikipedia.pdf>. This experiment is a step forward, aiming to understand whether granting 5 test wikis (that meet the experiment’s selection criteria) access to their own Wikipedia sites and domains improves their editing experience compared to when they were in the Incubator. Specifically, it seeks to determine whether access to modern wiki features that are available to Wikimedia wikis (e.g., Content Translation, Wikidata) plays a role in their editing productivity.


On Tue, Aug 27, 2024 at 5:51 AM Tochi Precious <tochiprecious2@gmail.com> wrote:
I've checked through the criteria, and I have nothing more to add but a suggestion:
Why don't you also make it a combination of recently added wikis as well as older ones? I noticed that the most recent one in the list has spent at least 2 years in the Incubator; maybe include something that has been there a year or less. I would also like to see the kind of results this will produce.

Thanks, Tochi, for your suggestion! For this experiment, the curated list of 35 languages meeting the inclusion and selection criteria ranges from 6 months to 16 years in the Incubator, with only 6 of these wikis having spent slightly less than 2 years. Since we need 5 wikis for the pilot, we have formed 5 clusters of languages ranging from low to mid to high activity (and across all time periods), with one language randomly selected from each cluster. We will observe the impact of the treatment at the cluster level and determine how this varies depending on the activity level of the project. Given the way we are clustering data and forming sets of languages, with each cluster meeting a specific set of criteria, it is essential to select a different language if we were to choose from within the same cluster. Regarding the 2-year time period, the closest we have is Pannonian Rusyn, which is about 2.24 years old.
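For readers who want to see the sampling step spelled out, the following is a small illustrative sketch of drawing one language at random from each cluster. The cluster labels, the placeholder language names, and the function `pick_one_per_cluster` are assumptions made for this example only; the actual clusters and method are described in the report linked in this message.

```python
import random

# Hypothetical clusters: the labels and member names are placeholders,
# not the real clusters from the experiment.
clusters = {
    "cluster_1_low": ["lang_a", "lang_b", "lang_c"],
    "cluster_2_low": ["lang_d", "lang_e"],
    "cluster_3_mid": ["lang_f", "lang_g", "lang_h"],
    "cluster_4_high": ["lang_i", "lang_j"],
    "cluster_5_high": ["lang_k", "lang_l", "lang_m"],
}


def pick_one_per_cluster(clusters, seed=None):
    """Randomly select exactly one language from each cluster.

    A fixed `seed` makes the draw reproducible; with seed=None the
    draw differs from run to run.
    """
    rng = random.Random(seed)
    # Iterate in sorted order so a given seed always yields the same picks.
    return {name: rng.choice(clusters[name]) for name in sorted(clusters)}


picks = pick_one_per_cluster(clusters, seed=2024)
```

Because each pick stands in for its own cluster, replacing a selected language (as later happened with Arn) would mean drawing again from the same cluster rather than from the full list.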

We have also published a report about the methodology used, various approaches considered, and how we reached the current set of languages at <https://analytics.wikimedia.org/published/reports/languages_onboarding_experiment/1b_group_select.html>. For a quick read, you can refer to the “Background” and “Approach” sections and summary in the “Clustering” and “Sampling” sections.

We would like to hear any further thoughts and suggestions, preferably by the end of this week!

Cheers,
Srishti

On Thu, Aug 22, 2024, 4:52 PM Srishti Sethi <ssethi@wikimedia.org> wrote:
Hello Language Committee,

I am writing today to share a proposal for an experiment addressing a new approach to onboarding a language wiki.


Since December 2023, we have had conversations with 35 relevant stakeholders, including three members from the Language Committee (Tochi, MF-Warburg, and Jon), to develop recommendations addressing a few current challenges with the incubation journey. Several recommendations emerged from these discussions; they are documented at https://www.mediawiki.org/wiki/Future_of_Language_Incubation/Recommendations and can be broadly grouped into the following two key areas:

  1. Streamlining technical infrastructure  

  2. Exploring social pathways


For the 2024-25 annual planned work of the Wikimedia Foundation and as part of the Content Growth objective (WE2/Knowledge Equity), the Language and Product Localization team, with guidance from the Language Committee members, identified a recommendation that addresses some of the difficulties of content creation in the Incubator due to technical limitations of the platform. To address this, we would like to try the following:


Identify a set of requests (maximum 5) from the list in the new wiki approval backlog which have been either already approved by the Language Committee, and prioritize their creation on the production infrastructure so that they do not have to continue writing content on the Incubator wiki. At the end of a stipulated period, we evaluate the progress of these prioritized wikis compared to other test projects (approved or otherwise) still in the Incubator.


Please see the detailed proposal, including selection and inclusion criteria, timeline, implementation plan, and more information.  We also presented this proposal at Wikimania 2024: https://youtu.be/BbGrkYK8FEk?t=20299


After consultations with several other teams inside the WMF relevant to this area of work, we believe this is a feasible starting point towards better content creation experiences for newer communities. To move forward, we would like to reach a shared agreement with the Language Committee and start the pilot. Based on the criteria listed in the email, we would like to include the following wikis in the experiment (also see attached screenshot):


  • Mapudungun

  • Southern Ndebele

  • Obolo

  • Tai Nüa

  • Pannonian Rusyn


We would like to kick off this experiment as early as possible and would really appreciate hearing your suggestions on changes or additions to the selection criteria and initial list of wikis by August 24th.


Cheers,

Srishti


Srishti Sethi
Senior Developer Advocate
Wikimedia Foundation
screenshot_from_2024-08-07_19-41-25.png
_______________________________________________
Langcom mailing list -- langcom@lists.wikimedia.org
To unsubscribe send an email to langcom-leave@lists.wikimedia.org