This paper (first reference) is the result of a class project I was part of
almost two years ago for CSCI 5417 Information Retrieval Systems. It builds
on a class project I did in CSCI 5832 Natural Language Processing and which
I presented at Wikimania '07. The project ran very late: we didn't send in
the final paper until the day before New Year's. As far as I recall, this
technical report was never really announced, so I thought it would be
interesting to look briefly at the results. The goal of this paper was to
break articles
down into surface features and latent features and then use those to study
the rating system being used, predict article quality and rank results in a
search engine. We used the [[random forests]] classifier, which allowed us to
analyze the contribution of each feature to performance by looking directly
at the weights that were assigned. While the surface analysis was performed
on the whole English Wikipedia, the latent analysis was performed on the
Simple English Wikipedia, because it is more expensive to compute.
= Surface features =
* Readability measures are the single best predictor of quality, as defined
by the Wikipedia Editorial Team (WET), that I have found. The
[[Automated Readability Index]], [[Gunning Fog Index]] and [[Flesch-Kincaid
Grade Level]] were the strongest predictors, followed by length of article
HTML, number of paragraphs, [[Flesch Reading Ease]], [[Smog Grading]], number
of internal links, [[Laesbarhedsindex Readability Formula]], number of words
and number of references.
* Weakly predictive were number of "to be" verbs, number of sentences,
[[Coleman-Liau Index]], number of templates, PageRank, number of external
links and number of relative links.
* Not predictive (overall - see the end of section 2 for the per-rating
score breakdown): number of h2 or h3 headings, number of conjunctions,
number of images*, average word length, number of h4 headings, number of
prepositions, number of pronouns, number of interlanguage links, average
syllables per word, number of nominalizations, article age (based on page
id), proportion of questions, average sentence length.
:* Number of images was actually by far the single strongest predictor of any
class, but only for Featured articles. Because it was so good at picking out
Featured articles and only somewhat good at picking out A and G articles, the
classifier was confused in so many cases that the overall contribution of
this feature to classification performance is zero.
:* Number of external links is strongly predictive of Featured articles.
:* The B class is highly distinctive. It has a strong "signature," with high
predictive value assigned to many features. The Featured class is also very
distinctive. F, B and S (Start/Stub) contain the most information.
:* A is the least distinct class, not being very different from F or G.
= Latent features =
The algorithm used for latent analysis, which is an analysis of the
occurrence of words in every document with respect to the link structure of
the encyclopedia ("concepts"), is [[Latent Dirichlet
Allocation]]. This part of the analysis was done by CS PhD student Praful
Mangalath. As an example of what can be done with the result of this
analysis, you provide a word (a search query) such as "hippie". You can then
look at the weight of every article for the word hippie. You can pick the
article with the largest weight, and then look at its link network. You can
pick out the articles that this article links to and/or which link to this
article that are also weighted strongly for the word hippie, while also
contributing maximally to this article's "hippieness". We tried this query in
our system (LDA), Google (site:en.wikipedia.org hippie), and the Simple
English Wikipedia's Lucene search engine. The breakdown of articles occurring
in the top ten search results for this word for those engines is:
* LDA only: [[Acid rock]], [[Aldeburgh Festival]], [[Anne Murray]], [[Carl
Radle]], [[Harry Nilsson]], [[Jack Kerouac]], [[Phil Spector]], [[Plastic
Ono Band]], [[Rock and Roll]], [[Salvador Allende]], [[Smothers brothers]],
[[Stanley Kubrick]].
* Google only: [[Glam Rock]], [[South Park]].
* Simple only: [[African Americans]], [[Charles Manson]], [[Counterculture]],
[[Drug use]], [[Flower Power]], [[Nuclear weapons]], [[Phish]], [[Sexual
liberation]], [[Summer of Love]]
* LDA & Google & Simple: [[Hippie]], [[Human Be-in]], [[Students for a
democratic society]], [[Woodstock festival]]
* LDA & Google: [[Psychedelic Pop]]
* Google & Simple: [[Lysergic acid diethylamide]], [[Summer of Love]]
(See the paper for the articles
produced for the keywords philosophy and economics.)
= Discussion / Conclusion =
* The results of the latent analysis are totally up to your
perception. But what is interesting is that the LDA features predict the WET
ratings of quality just as well as the surface level features. Both feature
sets (surface and latent) pull out almost all of the information that
the rating system bears.
* The rating system devised by the WET is not
distinctive. You can best tell the difference between, grouped together,
Featured, A and Good articles vs B articles. Featured, A and Good articles
are also quite distinctive (Figure 1). Note that in this study we didn't
look at Starts and Stubs, but in an earlier paper we did.
:* This is interesting when compared to this recent entry on the YouTube
blog, "Five Stars Dominate Ratings".
I think a sane, well-researched (with actual subjects) rating system is
well within the purview of the Usability Initiative. Helping people find and
create good content is what Wikipedia is all about. Having a solid rating
system allows you to reorganize the user interface, the Wikipedia
namespace, and the main namespace around good content and bad content as
needed. If you don't have a solid, information-bearing rating system you
don't know what good content really is (really bad content is easy to spot).
:* My Wikimania talk was all about gathering data from people about articles
and using that to train machines to automatically pick out good content. You
ask people questions along dimensions that make sense to people, and give
the machine access to other surface features (such as a statistical measure
of readability, or length) and latent features (such as can be derived from
document word occurrence and encyclopedia link structure). I referenced page
262 of Zen and the Art of Motorcycle Maintenance to give an example of the
kind of qualitative features I would ask people about. It really depends on
what features end up bearing information, to be tested in "the lab". Each
word is an example dimension of quality: We have "*unity, vividness,
authority, economy, sensitivity, clarity, emphasis, flow, suspense,
brilliance, precision, proportion, depth and so on.*" You then use surface
and latent features to predict these values for all articles. You can also
say that when a person rates an article as high on the x scale, they also
mean that it has this much of these surface and these latent features.
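To make the notion of a "surface feature" concrete, here is a minimal sketch of how two of the readability measures named above could be computed from plain text. It is not the code used in the paper: the tokenization is a crude regex heuristic, and the function names (`tokenize`, `automated_readability_index`, `lix`, `surface_features`) are hypothetical. The formulas are the standard published definitions of the Automated Readability Index and of LIX (Läsbarhetsindex).

```python
import re


def tokenize(text):
    """Split text into sentences and words with simple regex heuristics."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z0-9]+", text)
    return sentences, words


def automated_readability_index(text):
    """ARI = 4.71 * chars/words + 0.5 * words/sentences - 21.43."""
    sentences, words = tokenize(text)
    if not sentences or not words:
        return 0.0
    chars = sum(len(w) for w in words)
    return 4.71 * chars / len(words) + 0.5 * len(words) / len(sentences) - 21.43


def lix(text):
    """LIX = words/sentences + 100 * long_words/words (long = more than 6 letters)."""
    sentences, words = tokenize(text)
    if not sentences or not words:
        return 0.0
    long_words = sum(1 for w in words if len(w) > 6)
    return len(words) / len(sentences) + 100.0 * long_words / len(words)


def surface_features(text):
    """Collect a few surface features into a dict, one row per article."""
    sentences, words = tokenize(text)
    return {
        "num_words": len(words),
        "num_sentences": len(sentences),
        "ari": automated_readability_index(text),
        "lix": lix(text),
    }
```

A dict like the one returned by `surface_features` could serve as one row of the feature table handed to a classifier; a real pipeline would first need to strip the wiki markup or HTML.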
= References =
- DeHoust, C., Mangalath, P., Mingus, B. (2008). *Improving search in
Wikipedia through quality and concept discovery*. Technical Report.
- Rassbach, L., Mingus, B., Blackford, T. (2007). *Exploring the
feasibility of automatically rating online article quality*. Technical
I have asked and received permission to forward to you all this most
excellent bit of news.
The LINGUIST List is an excellent resource for people interested in the
field of linguistics. As I mentioned some time ago, they held a funding
drive in which they asked for a certain amount of money within a given
number of days; if they reached it, they would run a project on Wikipedia to
learn what needs doing to get better coverage for the field of linguistics.
What you will read in this mail is that the whole community of linguists is
asked to cooperate. I am really thrilled as it will also get us more
linguists interested in what we do. My hope is that a fraction will be
interested in the languages that they care for and help them become more
relevant. As a member of the "language prevention committee", I love to get
more knowledgeable people involved in our smaller projects. If it means that
we get more requests for more projects we will really feel embarrassed with
all the new projects we will have to approve because of the quality of the
Incubator content and the quality of the linguistic arguments why we should
approve yet another language :)
NB: Is this not a really clever way of raising money? Give us this much in
this time frame and we will then do this as a bonus...
---------- Forwarded message ----------
From: LINGUIST Network <linguist(a)linguistlist.org>
Date: Jun 18, 2007 6:53 PM
Subject: 18.1831, All: Call for Participation: Wikipedia Volunteers
LINGUIST List: Vol-18-1831. Mon Jun 18 2007. ISSN: 1068 - 4875.
Moderators: Anthony Aristar, Eastern Michigan U <aristar(a)linguistlist.org>
Helen Aristar-Dry, Eastern Michigan U <hdry(a)linguistlist.org>
Reviews: Laura Welcher, Rosetta Project
The LINGUIST List is funded by Eastern Michigan University,
and donations from subscribers and publishers.
Editor for this issue: Ann Sawyer <sawyer(a)linguistlist.org>
To post to LINGUIST, use our convenient web form at
From: Hannah Morales < hannah(a)linguistlist.org >
Subject: Wikipedia Volunteers
-------------------------Message 1 ----------------------------------
Date: Mon, 18 Jun 2007 12:49:35
From: Hannah Morales < hannah(a)linguistlist.org >
Subject: Wikipedia Volunteers
As you may recall, one of our Fund Drive 2007 campaigns was called the
"Wikipedia Update Vote." We asked our viewers to consider earmarking their
donations to organize an update project on linguistics entries in the
English-language Wikipedia. You can find more background information on this
The speed with which we met our goal, thanks to the interest and generosity of
our readers, was a sure sign that the linguistics community was enthusiastic
about the idea. Now that summer is upon us, and some of you may have a bit of
leisure time, we are hoping that you will be able to help us get started on the
Wikipedia project. The LINGUIST List's role in this project is a purely
organizational one. We will:
*Help, with your input, to identify major gaps in the Wikipedia materials or
pages that need improvement;
*Compile a list of linguistics pages that Wikipedia editors have identified as
"in need of attention from an expert on the subject" or "does not cite any
references or sources," etc;
*Send out periodical calls for volunteer contributors on specific topics or
*Provide simple instructions on how to upload your entries into Wikipedia;
*Keep track of our project Wikipedians;
*Keep track of revisions and new entries;
*Work with Wikimedia Foundation to publicize the linguistics community's
We hope you are as enthusiastic about this effort as we are. Just to help us
get started looking at Wikipedia more critically, and to easily identify an area
needing improvement, we suggest that you take a look at the List of
Many people are not listed there; others need to have more facts and
added. If you would like to participate in this exciting update effort,
respond by sending an email to LINGUIST Editor Hannah Morales at
hannah(a)linguistlist.org, suggesting what your role might be or which
entries you feel should be updated or added. Some linguists who saw our
on the Internet have already written us with specific suggestions, which we
will share with you soon.
This update project will take major time and effort on all our parts. The
result will be a much richer internet resource of information on the breadth and
depth of the field of linguistics. Our efforts should also stimulate
students to consider studying linguistics and to educate a wider public on what
we do. Please consider participating.
Editor, Wikipedia Update Project
Linguistic Field(s): Not Applicable
LINGUIST List: Vol-18-1831
>>> The people who are loudest in their demands for consensus
>>> do not represent the Wikimedia movement.
>> The voices loudest for the WMF doing something against the
>> Trump administration are not representative of the Wikimedia
>> movement either....
> Is the Community Process Steering Committee currently
> prepared to "engage more 'quiet' members of our community"
> with a statistically robust snap survey to resolve this question?
Anyone can go to Recent Changes and send a SurveyMonkey link to the
most recent few hundred editors with contributions at least a year
old, to get an accurate answer.
Will a respected member of the community please do this? I would like
to know what the actual editing community thinks of the travel ban and
their idea of an appropriate response. I don't want to see community
governance by opt-in participation in obscure RFCs.
I would offer to do this myself, but I value keeping my real name
unassociated with my enwiki userid.
This is a request for your input and possible ideas (if any) regarding my management of the fallout from Jimmy's announcing me as the 2018 Wikimedian of the Year.
Emails below are a copy of my ongoing consultations with Wikimedia Foundation staff and other Wikimedians I personally know, as well as a report on what's already brewing in my region of Russia after this unexpected outcome.
I would be grateful if you could advise me on how to properly steer the enthusiasm of the regional government, mass media, NGOs, etc., which have just discovered the possibility of participating in the Wikimedia movement (anything from the U.S. does not get much in-depth coverage in Russian in the sources that regional public figures, NGOs, teachers or general regional journalists read) & are now placing great hopes on teaching the whole of Tatarstan how to Wiki & on engaging all Tatars globally (3/4 outside of Tatarstan, 1/5 outside of Russia).
* Selet WikiSchool got presented to wider public @ the press-conference for Tatar-speaking journalists (July 31), at the poster session in the framework of the World Congress of Tatars Youth Forum (Aug.3), and later at a Tatar projects fair in the Downtown park of Kazan (Aug.5)
* The contribution of those writing for Wikipedia in Tatar was recognized by choosing me as one of the current year's "For the great service to Tatar nation" medal recipients. (Aug.3)
* We recorded a 40 min interview in Turkish (Aug.4, see https://www.youtube.com/watch?v=HK3tFBWMWcs )
* We didn't yet meet the President of the Republic (his schedule changed once again), but I got another firm request to meet with the Minister for Youth Affairs (another good acquaintance of mine, ex-member of local Comedy Club type activity), as well as with the Head of World Congress of Tatars' Executive Bureau, and more TV & written press interviews are lining up.
* I am trying to manage these guys' optimism & desire to move quickly, keeping in mind they are not really familiar with the Wiki-way or our communities' policies & practices
* Bashkir (ba) & Sakha (sah) communities have shared advice about how to assure that local Wikimedia community stays clearly independent in a cultural environment where neither Education, nor GLAM programs will attract local partners unless those have a strong support of the regional or federal government entities.
* Looking forward to meeting with my Wikimania-2017 roommate from the Philippines in Singapore on Aug.16 to collect some more input from Asia
-------- Forwarded message --------
03.08.2018, 19:35, "Фархад Фаткуллин / Farkhad Fatkullin" <frhd(a)yandex.com>:
Thursday, 2 August 2018
Republic of Tatarstan Ministry for Informatization & Communication @ IT-park, Kazan
Meeting with Tatarstan Deputy Prime Minister - Minister for Informatization & Communications Roman Shaykhutdinov
on opportunities Wikimedia projects & popularization thereof among the population of the Republic can bring for Tatarstan
* Almaz A. Valiullin, Director General of Tatarstan Center for Information Technologies http://mic.tatarstan.ru/eng/valiullin.htm
* Tatyana S. Kamaletdinova, CEO of Tatarstan Center for Information Technologies http://mic.tatarstan.ru/eng/kamaletdinova.htm
* Anna V. Yakovleva, Head of Ministry's Press-Service http://mic.tatarstan.ru/rus/about/structure_new?department_id=80576
* Farkhad N. Fatkullin, 2018 Wikimedian of the Year, Wikimedia Russia member https://meta.wikimedia.org/wiki/User:Frhdkazan
1. Content creation contests
2. Education projects
3. How can these be organized in a systemic way, engaging students throughout all municipal districts of Tatarstan & Tatar diaspora globally, and assessing effectiveness of measures
4. Larger Tatar language internet development, promoting content creation in the language
5. .tatar domain
7. Generating lists for content creation
8. Free licenses
10. Regional educational Wiki-seminar for beginners, introduction & orientation
11. Smartphone oriented educational projects around Tatar & Tatar Wikipedia (beginning with Tatar version of www.kakprav.com & on to WOK Master type )
12. OSM in Tatar
13. The way forward: visiting WMF to discover other opportunities, MoU with WMF?
1. Almaz Valiullin to head the Working Group on behalf of the Ministry
2. Next meeting set for / around 20th of August
3. Minister requested his staff to draft documents regarding moving all Regional & Municipal budget funded websites to CC-BY type free licenses
4. Minister proposed every organization website place a link to a Wikipedia article about it (in Tatar, Russian or English, depending on the language used)
5. Help is requested about developing simple samples or article creation guidelines for various institutions of Tatarstan & diaspora organizations to learn what is considered appropriate by Wiki-community & how should representatives of respective entities
6. Annual Tatar Internet Awards Ceremony to include awards to leading editors of Wikimedia projects in Tatar
7. Help is requested to develop a framework of Wikipedia Article Contests for Secondary School Children of Tatarstan, with article quality assessment schemes (prizes & organization to be funded by Tatarstan government, promotion in local mass-media, schools & etc.)
8. Help is requested to develop a framework for a sustainable functioning of a Tatarstan & Tatar language oriented Wikimedia thematic organization to be responsible for systemic work around developing & promoting Wikimedia projects in Tatar, as well as Tatarstan-oriented content creation for various Wikis
9. Readiness to consider awarding *.tatar domain names to those, who develop attractive projects in Tatar that can't be hosted on Wikimedia platform or otherwise need a Tatar digital identity
10. Help is requested regarding drafting a program proposal, necessary budget & expected results for a Tatarstan oriented introductory Wiki-seminar, open for a wide public
11. Help is requested to provide links to Wikidata educational materials, Practical Use Cases & expert responses to the Deputy Minister (in English)
12. Help is requested in organizing a learning visit to WMF Headquarters during Tatarstan Delegation annual fall visit to Silicon Valley & other places in U.S. to discover other opportunities
Farkhad Fatkullin - Фархад Фаткуллин http://sikzn.ru/ Tel. +79274158066 / skype:frhdkazan / Wikipedia:frhdkazan
03.08.2018, 11:42, "Фархад Фаткуллин / Farkhad Fatkullin" <frhd(a)yandex.com>:
> Hi Nichole & Kui,
> Thank you for your responses & readiness to help.
> Events on the ground are developing following predictable course - we had a very constructive 3 hour long discussion last night with Tatarstan's Deputy Prime Minister (IT minister) & his team, whilst earlier Thursday I got an invite to meet with another Deputy Prime Minister (President of the National Council of Trustees for the World Congress of Tatars Association).
> More TV & magazine interviews are expected, and some in-depth articles in Russian promoting Wikimedia projects are in the pipeline (I am expecting to see the texts to make necessary corrections & add links to the respective WMRU, Meta, WMF & related independent media articles, thanks to ComCom).
> I hope to find time to prepare & email both you (in English) & the IT ministry's Working Group (in Russian) my summary of yesterday's late evening meeting, with attendees, topics discussed, requests & proposals, etc. They want to move as fast as possible, because this will help them meet the task of promoting Tatar language use online that they are charged with by the region's President.
> Just a few to begin with:
> * Ready to move all regional and municipal government websites into CC-BY, add links to respective WP articles from all these
> * Willing to have Tatarstan & Tatar language oriented thematic organization in place to work with secondary schools & Universities, GLAMS, diaspora
> * Ready to sponsor prizes for article writing contests (either through WMRU or this new entity)
> * Requesting guidance from myself & Wikimedia community on what's the best way to make this all successful
> * Interested in supporting WMRU organized general public educational Wiki-Seminar in Tatarstan Academy of Sciences or other respected facility with a large conference hall open to all public (September-October?)
> * Willing to visit WMF headquarters in November to meet, learn more about what's there (I admit I don't read all mailing lists or Meta discussions, don't have time to see all the wonderful YouTube videos available, haven't yet had time to visit a single Wikimedia Conference) & possibly sign MoUs or whatever would help speed up Wikimedia acceptance & popularity growth among the population of the region.
> I am in touch with Wikimedia RU & the Wikimedia Languages of Russia Community regarding ongoing developments, benefiting from the collective wisdom of Wikimedians I know.
> Should you be ready to bring any ideas to the table, please shoot them my way.
> 02.08.2018, 16:35, "Nichole Saad" <nsaad(a)wikimedia.org>:
>> Hi Farhad,
>> First, congratulations on being named "Wikimedian of the Year!" We have received your letter, and are looking forward to engaging with your ideas. Realistically, I'll be able to provide a more in-depth response early next week.
>> best regards,
>> On Wed, Aug 1, 2018 at 2:45 PM Фархад Фаткуллин / Farkhad Fatkullin <frhd(a)yandex.com> wrote:
>>> Dear Sirs,
>>> This is from Farhad, User:frhdkazan (ComCom member, now a.k.a. 2018 Wikimedian of the Year) with some proposals about how to leverage a fleeting opportunity I got. Please read the text below this letter ASAP & think how you can help seize the moment. I'm looking forward to first ideas within 24 hours from now.
>>> Any support of yours would be greatly appreciated.
>>> Thanks a million.
>>> P.S. More on where I'm coming from can later be discovered @ http://frhd.narod.ru/resume-en.htm (dated, stopped my retainer contract with the Office of Tatarstan President in September 2013) & more up-to-date but in Russian @ http://sikzn.narod.ru/index/0-4 . I have previously translated & provided digital media with the link to Russian text of Wikimedia Blog post about Jess Wade https://ru.wikinews.org/wiki/?curid=176108 , & now contacted Asian colleagues to get similar content about Nahid Sultan & Wikimedia Bangladesh. Also on my agenda is to record a video-interview in Turkish for Wikimedia Turkey's communication campaign - I speak the language & we are in touch with Turkish community. I also need to get back to interpreting into Russian videos of Wikimania-2018 available on YouTube - maybe later this week, before I join my family for a few days off & then a week-long interpretation assignment in Singapore.
>>> Following Jimmy's unexpected pass during the Wikimania-2018 Closing Ceremony, I became an instant celebrity of sorts - I alone have seen over 40 articles with positive PR about Wikipedia & the wider body of Wikimedia projects,
>>> * one by RT in Russian collecting over 1.1 million Facebook likes in under 8 hours from publication
>>> * short positive ones from Russian Federation government's Official gazette & Russkiy Mir Foundation
>>> * first ever mention that people can donate to the Wikimedia Russia volunteers organization for us to be able to fund local Wiki-seminars, conferences & contests (WMF is not funding Wikimedia Russia & can't allow us to use the donate link from Russian Wikipedia or place a link on the "Thank you, but we don't accept donations from Russia" page where users from Russia are being routed)
>>> * about to get one with links published to various WMF & Wikimedia Russia projects, including promoting free licenses, Education & GLAM programs, etc.
>>> I'm periodically collecting these @ https://tt.wikipedia.org/wiki/Википедия:Безнең_турында_матбугатта#Ел_викиме… when time permits, but there's more in my Facebook Messenger.
>>> This is also generating very positive attention from the Republic of Tatarstan government, as well as from our regional NGO partner, Selet Youth Education Foundation, with whom the yet unrecognized Tatar Wikimedians are jointly running the Selet WikiSchool project. https://outreach.wikimedia.org/wiki/Education/News/May_2018/Selet_WikiSchool The latter want to introduce this joint initiative, as well as Wikimedia Russia executive director Stanislav Kozlovsky (Ph.D. in Psychology, Researcher & a Senior lecturer at Moscow State University) and myself to the President of Tatarstan during his visit to their Annual International Forum of Tatar-speaking High-School Age Youth on August 6th. Unrelated to that, I was called up yesterday by a good acquaintance of mine (Tatarstan Vice Prime-Minister, Minister for IT https://en.wikipedia.org/wiki/Roman_Shaykhutdinov ), who invited me to have a talk with him tomorrow about how promoting Wikimedia projects in the wider Tatar-speaking world (only about 25% of Tatars live in Tatarstan) can help him do whatever the current President wants him to do to develop Tatar language use online. There are some other Tatarstan Prime-Minister Office level things in the pipeline, & I am also waiting for a call from Russian Federation ex-IT Minister https://en.wikipedia.org/wiki/Nikolay_Nikiforov , whom I know from his years in the Tatarstan government as well (until 2012). FYI: the Wikimedia movement is still unknown in Russia, with local mass media outlets predominantly speaking about Wikipedia in Russian when there is some lapse that can be exploited to show one's Russian patriotism & thus score some points (basically the same political circus as what we are seeing in the U.S. with the Russia-collusion story in overdrive, just in the opposite direction), and here we get something to reverse the situation big time, explaining to the wide public that it's not only OK, but even desirable that they engage with Wikimedia projects.
>>> Keeping in mind that the Tatarstan President is the head of both the Russia-Islamic World Strategic Vision Group & the Association of Innovative Regions of Russia, Jimmy gave me a-hell-of-an-opportunity for an elevator pitch for the Wikimedia movement in Russia and some adjacent countries (think Turkey & Central Asia), and I would really hate seeing it go unused. I will do my part here (about CC-BY for regional government controlled websites, Wikipedia Education Program as an extracurricular activity in all secondary schools & universities, GLAM, as well as some locally funded carrots for participants), but I also want your help in driving this to a home-run. Every fall the Tatarstan President & his delegation visit Silicon Valley & other places of interest in the U.S. & it would be great if we could set up a physical visit to the Wikimedia Foundation, with a first hand presentation of the best international practices the movement is proud of & have some MoU on cooperation signed (as a bureaucratic basis for continuing the conversation) with some department in the Tatarstan government, just like the one you signed with the Mexican Ministry of Culture www.eluniversal.com.mx/cultura/secretaria-de-cultura-y-wikimedia-firman-con…. To set the context, Tatarstan has such formal documents with a number of American companies & institutions, with our private English-speaking IT University @ https://en.wikipedia.org/wiki/Innopolis having been developed jointly with Carnegie Mellon University (with consultancy fees paid by Tatarstan), we have Russia's first publicly funded hospital certified to be meeting https://en.wikipedia.org/wiki/Joint_Commission requirements & plenty of U.S. investors @ our https://en.wikipedia.org/wiki/Alabuga_Special_Economic_Zone .
>>> On top of this, we could benefit from having the Selet Youth Educational movement http://selet.biz/en/ embrace Wiki even further, so maybe a similar MoU on collaboration between them and Wikiedu.org signed simultaneously would be great (I met LiAnna and Jami in Montreal last year & was really impressed with what these guys are doing).
>>> Please shoot something my way before the same time tomorrow, for me to handle the meeting with Tatarstan Deputy Prime Minister more effectively to progressively open other opportunities I've touched on. On top of what I described above & my last year's ideas @ https://meta.wikimedia.org/wiki/User:Frhdkazan/Wiki4RegionalDevt (in Russian), I'll be bringing to the table topics of:
>>> * Wikidata
>>> * OSM, as well as a
>>> * WOK-like project for all things in Tatar (I was contacted by a local 9th grader who, on his own initiative with a help of a teacher, did something similar for students who want to train for Russian-language SAT/ACT type exam www.kakprav.com & now offered to develop such or bigger thing for Tatar). For more on WOK, see https://www.vanguardngr.com/2018/01/wikipedia-wok-seek-nigerian-content-onl…
>>> * whatever else you or anybody else can help me think of until then & or later opportunities
>> Nichole Saad
>> Wikimedia Foundation | Senior Program Manager, Education
>> user: NSaad (WMF)
-------- End of forwarded message --------
Farkhad Fatkullin - Фархад Фаткуллин http://sikzn.ru/ Tel. +79274158066 / skype:frhdkazan / Wikipedia:frhdkazan
I was asked by a volunteer for help getting stats on the gender gap in
content on a certain Wikipedia, and came up with simple Wikidata Query
Service queries that pulled the total number of articles on a given
Wikipedia about men and about women, to calculate *the proportion of
articles about women out of all articles about humans*.
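As a rough illustration of the kind of query described above, the sketch below builds a SPARQL count query for the Wikidata Query Service and computes the proportion from two counts. The exact queries used for the table are not given in this mail, so everything here is an assumption: the function names are hypothetical, and the denominator counts only items whose "sex or gender" (P21) is male or female, which simplifies "all articles about humans". The identifiers P31, P21, Q5, Q6581072 and Q6581097 are the real Wikidata properties and items for "instance of", "sex or gender", "human", "female" and "male".

```python
FEMALE = "Q6581072"  # Wikidata item for "female"
MALE = "Q6581097"    # Wikidata item for "male"


def build_count_query(wiki_host, gender_qid):
    """Return a SPARQL query counting humans of one gender that have an
    article on the given wiki (e.g. "he.wikipedia.org")."""
    return f"""
SELECT (COUNT(DISTINCT ?item) AS ?count) WHERE {{
  ?item wdt:P31 wd:Q5 ;
        wdt:P21 wd:{gender_qid} .
  ?article schema:about ?item ;
           schema:isPartOf <https://{wiki_host}/> .
}}
""".strip()


def proportion_women(women_count, men_count):
    """Share of biographies about women among those about women or men."""
    total = women_count + men_count
    if total == 0:
        return 0.0
    return women_count / total
```

The query text can be pasted into the Wikidata Query Service, once per gender, and the two resulting counts passed to `proportion_women`. On the largest wikis a query of this shape may run into the service's time limit.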
Then I was curious about how that wiki compared to other wikis, so I ran
the queries on a bunch of languages, and gathered the results into a table,
(please see the *caveat* there.)
I don't have time to fully write-up everything I find interesting in those
results, but I will quickly point out the following:
1. The Nepali statistic is simply astonishing! There must be a story
there. I'm keen on learning more about this, if anyone can shed light.
2. Evidently, ~13%-17% seems like a robust average of the proportion of
articles about women among all biographies.
3. Among the top 10 largest wikis, Japanese is the least imbalanced. Good
job, Japanese Wikipedians! I wonder if you have a good sense of what
drives this relatively better balance. (my instinctive guess is pop culture
4. Among the top 10 largest wikis, Russian is the most imbalanced.
5. I intend to re-generate these stats every two months or so, to
eventually have some sense of trends and changes.
6. Your efforts, particularly on small-to-medium wikis, can really make a
dent in these numbers! For example, it seems I am personally
responsible for almost 1% of the coverage of women on Hebrew Wikipedia!
7. I encourage you to share these numbers with your communities. Perhaps
you'd like to overtake the wiki just above yours? :)
8. I'm happy to add additional languages to the table, by request. Or you
can do it yourself, too. :)
 Yay #100wikidays :) https://meta.wikimedia.org/wiki/100wikidays
Wikimedia Foundation <http://www.wikimediafoundation.org>
Imagine a world in which every single human being can freely share in the
sum of all knowledge. Help us make it a reality!
As I mentioned in an earlier thread, we will be running reader
surveys across a number of Wikipedia languages to learn about the
reader needs and motivations in these languages as well as some of
their demographic information (and perhaps the correlations between
demographics and user motivations and characteristics).
If your language community is interested in having statistics on the
distribution of reader gender, age, education, native language, and
geographic region (rural/urban) in your language (and depending on how
much data we collect in your language, perhaps more insights), this is
your chance to indicate interest at:
I initially communicated 2019-02-15 as the deadline to sign up. Since
then, we have run a pilot test on enwiki and we are investigating some
of the results to see if any changes in the survey questions are
needed. You now have until 2019-03-15 to indicate interest.
As always: this call is primarily a service to your language
community. If you like it, take action on it. If you don't, no action
is needed. :)
The committee has finished selecting new members and the new committee
candidates are (in alphabetical order):
- Amir Sarabadani
- Lucie-Aimée Kaffee
- Tonina Zhelyazkova
- Tony Thomas
And auxiliary members will be (in alphabetical order):
- Nuria Ruiz
- Rosalie Perside
You can read more about the members in 
The changes are:
* Nuria and Rosalie are moving from main members to auxiliary members
* MusikAnimal is moving from auxiliary member to main member
* Tonina Zhelyazkova is joining the main members
This is not the final structure. According to the CoC, the current
committee publishes the new members and calls for public feedback for *six
weeks*; after that, the current committee might apply changes to the
structure based on public feedback.
Please let the committee know if you have any concerns regarding the members
and the committee's structure by *19 June 2019*; after that, the new committee
will be in effect and will serve for a year.
Amir, On behalf of the Code of Conduct committee
I would like to inform you about my resignation from the Executive Board of
Shared Knowledge, effective from 1 July 2019.
After spending almost ten years on the boards of Wikimedia Macedonia and
Shared Knowledge with the last five years as president of Shared Knowledge,
this was a difficult decision for me but, in order to allow the
organisation to adapt to the ever-changing environment in the movement, it
seems the right time to take this step has come. The reasons for my
decision to step down are mostly personal, including lack of time, lack of
motivation, and hunger for new challenges.
To give better insight into my board experience, I can effectively divide
it into two different but related periods: the first five years spent as a
board member of Wikimedia Macedonia as a passive chapter were marked by
extensive learning about the movement and the programmes of the other
chapters with a limited amount of educational activities; the second five
years spent as president of Shared Knowledge as an active user group were
marked by putting that learning into practice with several programmes
abundant with projects and events.
The exact meaning of my resignation is, however, not retirement from the
movement but rather a substantial reduction of the time spent on some
activities. My future plans are to remain active in the movement in other
I would like to thank you all for the collaboration so far and express my
hope to extend it in my non-presidential role from now on. I will
not leave this mailing list and will continue following the news from the
Chair of the Executive Board of Shared Knowledge
Are there any plans to add ticket options that are more affordable,
especially for volunteers?
Even with Early Bird discount, $270 is... a lot, frankly (and after
today, $375). Prior to Montreal two years ago (Cape Town last year was
also similarly expensive), the registration cost of Wikimanias for
Wikimedia contributors was generally in the 30-50€ range for the whole
event - even Esino Lario, where food and full accommodation were
included in the default ticket, also had a 'simple' ticket option
skipping this that was still in the usual price range. Is there any
chance we could bring this practice back? Or... something?
This isn't even just that I can't afford this (which I can't - the
registration costs more than the plane ticket would), /a lot of us/
probably aren't going to be able to. Wikimania isn't like most
conferences, where attendees are being sent by their companies or
organisation; many of us who would consider going are individuals. Not
only do we not necessarily have any larger organisations to fund our
attendance, we're largely not getting paid for any of this, either -
we're donating our time to be a part of this movement, and now we're
expected to pay hundreds of dollars, as community members, to attend an
event that used to be specifically for the community?
This is especially going to be a major turnoff to any newcomers, as it
precludes people just registering and checking it out, seeing what's up,
unless they have a lot of money to throw around on things they're not
sure about. And based on the conversations I've had with various
newcomers over the years who have been drawn into such events (previous
wikimanias, hackathons, other conferences) and been highly engaged and
inspired by their experiences, this is apt to be a major loss.
I'll also note that while scholarships do resolve this issue for some
people, the scholarship budget is limited, and also generally focussed
on travel and accommodation costs for folks who would otherwise not be
able to get there, not attendance costs for people who can get there
just fine but would prefer to spend that 250€ on something else, like a
couple of months of groceries. There wasn't a 'just cover the
registration fee' option with the scholarship applications at all. Not
that... there should be?
Basically, would it be possible to maybe get some more options here?
On 24/05/2019 23:33, Isabel Cueva wrote:
> Great News! The Wikimania discount registration 'early bird' price
> period has been extended to May 31st! Details:
> Also, don’t forget: If you want to make a presentation, run a
> workshop, or display a poster during Wikimania, the Call for
> Submissions is NOW OPEN <https://wikimania.wikimedia.org/wiki/Wikimania>
> On Fri, May 17, 2019 at 10:22 AM Isabel Cueva <icueva(a)wikimedia.org
> <mailto:email@example.com>> wrote:
> Attention Everyone (and please spread the word):
> Early Bird Registration is now open for Wikimania 2019 on our
> This discount pricing ends on May 24th so don’t delay!
> Online registration will be open from today to July 30th, 2019.
> For more information please visit:
> Wikimania 2019 will be held at Stockholm University
> <https://en.wikipedia.org/wiki/Stockholm_University>, Sweden
> <https://en.wikipedia.org/wiki/Sweden>, from 14th to 18th August 2019.
> The venue will host the majority of the conference, hackathon,
> meetups, and pre-events.
> We would like to encourage all speakers and attendees to register
> early and book their flight and travel as soon as possible. If you
> have questions about visas, please visit our wiki visa page
> If you have any questions with regard to the conference, please
> wikimania-info(a)wikimedia.org <mailto:firstname.lastname@example.org>
> Don’t forget: If you want to make a presentation, run a workshop,
> or display a poster during Wikimania, the Call for Submissions is
> NOW OPEN <https://wikimania.wikimedia.org/wiki/Wikimania>
> We hope you can join us in Stockholm this summer!
> Isabel Cueva, WMF Event Program Manager
> on behalf of the Wikimania ‘19 Organizing Team
> *Isabel Cueva*
> Event Program Manager
> Wikimedia Foundation <https://wikimediafoundation.org/>
> Wikimania-l mailing list