The on-wiki version of this newsletter can be found here:
https://meta.wikimedia.org/wiki/Abstract_Wikipedia/Updates/2021-04-29
This week, I want to start with a shoutout to our phenomenal volunteers.
Lexicographical coverage
My thanks to Nikki <https://www.wikidata.org/wiki/User:Nikki> and their
updates on the dashboards about lexicographical coverage
<https://www.wikidata.org/wiki/Wikidata:Lexicographical_coverage>. Since
the first publication of the dashboard, Nikki has kept the dashboards up to
date, re-running them from time to time and updating the page on Wikidata.
They and others have also fixed numerous issues, created more actionable
lists, and added more languages based on other corpora than Wikipedia (most
notably from the Leipzig Corpora Collection
<https://wortschatz.uni-leipzig.de/en>). Thanks also to Mahir
<https://www.wikidata.org/wiki/User:Mahir256>, who also contributed to the
dashboard, particularly covering Bengali, one of our focus languages.
In fact, thanks to Nikki and Mahir, the four main focus languages are now
all covered: we have numbers for Bengali, Malayalam, Hausa, and Igbo. We
are still missing our stretch focus language, Dagbani, because we could not
yet find a corpus. We have reached out to a researcher who has compiled a
Dagbani corpus
<https://www.aflat.org/content/corpus-building-predominantly-oral-culture-no…>,
and we are also exploring how we could use the Dagbani Wikipedia
<https://incubator.wikimedia.org/wiki/Wp/dag> on Incubator
<https://incubator.wikimedia.org/wiki/Incubator:Main_Page>. In the
meantime, we are pleased to see that the Dagbani community has put in a request
for a new Wikipedia edition
<https://meta.wikimedia.org/wiki/Requests_for_new_languages/Wikipedia_Dagbani>
and that they feel that they are ready to graduate from incubator!
Congratulations!
Some of the results of highlighting the dashboard, and particularly the
list of most frequent missing lexemes, were very promising: coverage in a
number of languages has increased considerably. To just list a few
examples: Polish went from 16% to 32% coverage, German from 53% to 67%,
Czech from 44% to 57% — and Hindi went from a mere 1% to 15%, and Malay
from 15% to an astonishing 53%! Congratulations to those communities and
others for such visible progress.
With an eye on our focus languages, Bengali went from 18% to 28%, Malayalam
is at 21%, whereas Hausa and Igbo both have coverages of below 1%.
Another great tool to see the progress in lexicographical knowledge
coverage in Wikidata is Ordia <https://ordia.toolforge.org/>, developed by Finn
Årup Nielsen <https://meta.wikimedia.org/wiki/User:Fnielsen>. Ordia is a
holistic user experience that allows users to browse and slice and dice the
lexicographic data in Wikidata in real time. We can take a look at the 11,400
Malayalam Lexemes <https://ordia.toolforge.org/language/Q36236>, the 8,724
Bengali Lexemes <https://ordia.toolforge.org/language/Q9610>, 53 Dagbani
Lexemes <https://ordia.toolforge.org/language/Q32238>, 15 Hausa Lexemes
<https://ordia.toolforge.org/language/Q56475>, and the single Lexeme in Igbo
<https://ordia.toolforge.org/language/Q33578>, mmiri, the Igbo word for
water. Thanks to Finn for Ordia!
Making the state of the lexicographical coverage visible shows us that
there is still a lot to do — but also that we are already achieving
noticeable progress! Thanks to everyone contributing.
By the way, the annotation wiki <https://annotation.wmcloud.org/> is
currently having issues. If you would like to help us with running it and
have experience with Vagrant and Cloud VPS based wikis, please drop me a
line on my talk page <https://meta.wikimedia.org/wiki/User_talk:Denny>.
A first running function call!
Lucas Werkmeister <https://meta.wikimedia.org/wiki/User:Lucas_Werkmeister>
consistently keeps being amazing. He is working on GraalEneyj
<https://github.com/lucaswerkmeister/graaleneyj>, a GraalVM
<https://www.graalvm.org/>-based evaluation engine for Wikifunctions,
written in Java. Lucas re-wrote GraalEneyj to be able to call a function
directly from the notwikilambda test-wiki — the very first time that
one of our functions is being evaluated! You can watch that moment in a
Twitch video <https://www.twitch.tv/videos/975239172>.
We are still working on replicating that feat in what will be our
production codebase, and hope to soon connect our backend evaluating
functions with the wiki — this is our goal for the ongoing Phase δ
<https://meta.wikimedia.org/wiki/Abstract_Wikipedia/Phases#Phase_%CE%B4_(del…>
(delta). Congratulations to Lucas for achieving this step!
Delay on logo
There will be a delay on the logo finalization. Please expect another month
or two before we will have news to share about the logo. Due to the legal
nature of some of the involved issues, we have decided to not share details
in public. Sorry for the delay, and I am looking forward to sharing the
next steps in this process.
New documents
We have been working for a while with the Wikimedia Architecture Team on a
number of artefacts around Abstract Wikipedia and Wikifunctions. We have
now published and shared these documents in the Architecture repository
<https://www.mediawiki.org/wiki/Architecture_Repository/Strategy/Goals_and_i…>.
We are aiming to keep publishing our design documents and related
development artefacts, and are happy to invite you to this set of documents.
Based on requests from the community, we also worked on a new example of an
article in abstract content
<https://meta.wikimedia.org/wiki/Abstract_Wikipedia/Examples/Jupiter>. The
example is not complete, and is open to being edited and discussed. Note
that this is not meant to be prescriptive of what abstract content should
look like, but merely a more concrete hypothetical example of what it could
look like. I am confident that the community as a whole will come up with
better abstractions than I did. Please do edit or fork that page.
There will be three approaches towards creating an implementation for a
function in Wikifunctions, and the current and following two phases of
development are each dedicated to one of those approaches: (1) calling a
built-in implementation in the evaluator engine, (2) calling native code
in a programming language, and (3) composing other functions to
implement a new function. In preparation for the upcoming Phase ζ
<https://meta.wikimedia.org/wiki/Abstract_Wikipedia/Phases#Phase_%CE%B6_(zet…>
(zeta), we have created a few examples of function composition
<https://meta.wikimedia.org/wiki/Abstract_Wikipedia/Examples/Function_compos…>
.
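To make the three approaches a little more tangible, here is a minimal sketch in Python. It is not how Wikifunctions implements any of this: the function names (builtin_concatenate, to_upper, shout) and the idea of loading native code from a string are assumptions made purely for illustration.

```python
# (1) A built-in: implemented directly inside the (hypothetical) evaluator.
def builtin_concatenate(first: str, second: str) -> str:
    return first + second


# (2) Native code: source text in a programming language, stored as data
# and loaded by the evaluator. Here we simply exec() a Python snippet.
NATIVE_SOURCE = "def to_upper(text):\n    return text.upper()\n"
namespace = {}
exec(NATIVE_SOURCE, namespace)
to_upper = namespace["to_upper"]


# (3) Composition: a new function defined only in terms of other functions.
def shout(text: str) -> str:
    return builtin_concatenate(to_upper(text), "!")


print(shout("hello"))  # HELLO!
```

The point of the third approach is that shout needs no code of its own beyond naming which existing functions to call and how to connect them.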
The on-wiki version of this newsletter is here:
https://meta.wikimedia.org/wiki/Abstract_Wikipedia/Updates/2021-04-21
We are very happy to announce that we have been joined by Aishwarya
Vardhana and Carolyn Li-Madeo. Aishwarya will join the team permanently,
and Carolyn will support us over the course of the next few weeks. They are
both exceptional UX designers and we are very much looking forward to
working with them.
I'll let them both introduce themselves with their own words:
Aishwarya:
Hello and namaste everyone! My name is Aishwarya (she/her). I am so happy
to be joining the AW team to assist their efforts towards building a deeply
multilingual and multicultural Wikipedia. I grew up in a household
committed to preserving indigenous knowledge through art, culture, and oral
tradition. At a young age I read about the anticolonial teachings of
Mahatma Gandhi and Dalit activist B.R. Ambedkar, and I believe this was the
origins of my interest in knowledge production and knowledge equity. I
studied Art and Art History and Product Design at Stanford University where
I was an organizer for Black Lives Matter. I recently finished a 10 week
course on an Indigenous Study Guide to Abolition at the Yellowhead
Institute, found it revelatory, and have been thinking a lot about
decolonizing systems! In this vein, I am invested in creating more
connectivity between the worlds of technology, design, and social
movements. I like to write about decolonizing design here
<https://aishwaryavardhana.medium.com/>. I look forward to building with
all of you. Favorite Wikipedia page: Critical Pedagogy
<https://en.wikipedia.org/wiki/Critical_pedagogy>
Carolyn:
Hi! My name is Carolyn (she/her) and I’m excited to be supporting Aishwarya
and the Abstract Wikipedia team for the next 6 weeks. I like to work on
projects that empower people to question, learn and explore in new ways. In
a previous life I was a reference librarian at a commuter college in my
beautiful hometown of Brooklyn, NY, which is where I became passionate
about free and usable access to knowledge via the internet. For the past
four years I have been the designer of the Wikipedia iOS app, where it’s
been a pleasure to design and refine products for mobile readers and
editors. During my time with the Abstract Wikipedia team I hope to be a
steward of design thinking in partnership with Aishwarya, as well as begin
to develop a modular and human-centered design roadmap for the project. My
favorite Wikipedia list is a List of sundial mottos
<https://en.wikipedia.org/wiki/List_of_sundial_mottos>, my favorite article
is Women in climate change
<https://en.wikipedia.org/wiki/Women_in_climate_change>.
We are positively thrilled to have them on board. In the coming weeks, we
will share more about how we think about our UX challenges. We think that
it is crucial for Wikifunctions to be truly accessible and approachable, in
order to achieve our goal of democratizing access to code and functions.
With Aishwarya and Carolyn joining us, we can finally work toward that
piece of the puzzle in order to make Wikifunctions the project it could be.
The on-wiki version of this update can be found here:
https://meta.wikimedia.org/wiki/Abstract_Wikipedia/Updates/2021-04-15
A few weeks ago, we sent out a call for focus languages for improving the
lexicographic extension of Wikidata, and so for Abstract Wikipedia. Over
the last few weeks, we have received submissions from nine language
communities: Bengali, Dagbani, Danish, Esperanto, French, Hausa, Igbo,
Malayalam, and Russian. We want to thank all of these communities for their
trust and all of the submitters for their work.
The Wikidata team at Wikimedia Deutschland and the Wikifunctions and
Abstract Wikipedia team at the Wikimedia Foundation have deliberated and
discussed the options over the last few days. Following our own criteria
<https://www.wikidata.org/wiki/Wikidata:Lexicographical_data/Focus_languages…>,
we chose the following four languages as focus languages (plus one stretch
language). Lydia Pintscher <lydia.pintscher(a)wikimedia.de> and Denny
Vrandečić <vrandecic(a)gmail.com> announced the languages at the closing
event of the 30 Lexic-o-days
<https://www.wikidata.org/wiki/Wikidata:Events/30_lexic-o-days_2021> on
Wednesday, 14 April:
- Bengali is an Indo-European language (belonging to the Eastern
Indo-Aryan branch), spoken in Bangladesh and in India, mostly in the
eastern Indian states of West Bengal and Tripura. Bengali is spoken by more
than 220 million native speakers, making it the fifth most-spoken language
in the world. The Bengali Wikipedia has more than 100,000 articles and more
than 1,700 active contributors. Like a few other languages in the region,
it is written using the Bengali-Assamese script, an abugida and Brahmic
script. Bengali is easily the most widely-spoken language and the largest
Wikipedia community among the focus language communities (see more on the
English Wikipedia's article
<https://en.wikipedia.org/wiki/Bengali_language>)
- Malayalam is a Dravidian language, spoken mainly in the southern Indian
state of Kerala. Malayalam is spoken by more than 30 million native
speakers. The Malayalam Wikipedia has more than 70,000 articles and about
300 active contributors. Of note is also the Malayalam Wiktionary, which
has more than 130,000 entries, outnumbering Wikipedia in Malayalam.
Malayalam is written in the Malayalam script, which, like Bengali-Assamese,
is an abugida and Brahmic script. The Malayalam community has been active
on Wikidata, with numerous local identifiers and good usage of the data in
Wikipedia (see more on the English Wikipedia's article
<https://en.wikipedia.org/wiki/Malayalam>)
- Hausa is an Afroasiatic language (belonging to the Chadic branch) and an
official language in Nigeria, Ghana, and Niger. Estimates of the number of
native speakers range between 50 and 150 million. It is the most important
indigenous lingua franca in West and Central Africa. Hausa Wikipedia has
more than 8,000 articles and about 50 active contributors. Whereas Hausa
was historically written in an Arabic alphabet, today it is mostly written
in a Latin-based alphabet. The community has used Wikidata in infoboxes and
has brought a number of modules and templates to Hausa (see more on the
English Wikipedia's article
<https://en.wikipedia.org/wiki/Hausa_language>)
- Igbo is a Niger-Congo language (belonging to the Volta-Niger branch)
spoken mainly in south-eastern Nigeria with about 45 million native
speakers. Igbo Wikipedia has about 2,000 articles and about 50 active
contributors. Igbo is using a Latin-based script, although it has both an
interesting history with the Nsibidi ideograms and possibly an interesting
future with the Ndebe script. The community is using the ArticlePlaceholder
extension in the Igbo Wikipedia (see more on the English Wikipedia's
article <https://en.wikipedia.org/wiki/Igbo_language>)
- Dagbani is a Niger-Congo language (belonging to the Gur branch) spoken
in northern Ghana. Dagbani has about 3 million native speakers. Dagbani
Wikipedia is still in Incubator, and already has more than 400 articles. It
is written in a Latin-based alphabet. Whereas Dagbani didn’t fulfill some
of our criteria, the community that applied was very enthusiastic. We
regard Dagbani as our stretch goal: it will be instructive to learn whether
we can achieve our goals even for such a small language community (see more
on the English Wikipedia's article
<https://en.wikipedia.org/wiki/Dagbani_language>)
Additionally, we will use English as a demonstration language, in order to
showcase the developed features to a wider audience.
We will continue to listen for input from all language communities and seek
conversation with all communities. The point of the focus languages is to
be able to reach out to some communities quickly, for testing out designs
and prototypes, and to see if certain features also work beyond the
languages the developers can test easily.
Thanks again to everyone who participated in the process and who helped to
run it smoothly!
The on-wiki version of this update can be found here:
https://meta.wikimedia.org/wiki/Abstract_Wikipedia/Updates/2021-04-08
A few days ago we were asked for a mission statement for Wikifunctions, and
we realized we didn’t have one yet. James Forrester then wrote a first
draft, which the team slightly improved, intentionally inspired by the
language used in the Wikimedia movement's vision:
“A Wikimedia project for everyone to collaboratively create and maintain a
library of code functions to support the Wikimedia projects and beyond, for
everyone to call and re-use in the world's natural and programming
languages.”
I don’t expect this to be the final version, and you are all invited to
improve it. It was pointed out to us that it is still a bit long and wordy.
That said, let's take a closer look at the text as it is now:
*“A Wikimedia project”*: Wikifunctions is a project
<https://meta.wikimedia.org/wiki/Wikimedia_projects> by and of the
Wikimedia movement, in the same sense as Wikipedia, Wikidata, Wiktionary,
Wikimedia Commons, and the other seven projects. In that sense, it is both
a wiki website and a community.
*“for everyone”*: When you think about the question of who will benefit
from Wikifunctions, the answer is that we target everyone. There are
practical limitations in reaching truly everyone: for example, a computer
is a prerequisite for being able to benefit from Wikifunctions. That is
different from e.g. Wikipedia or Wikivoyage, where the print-out of a page
can be very useful without having to have a computer. But for a function to
be useful, a computer will be needed in order to evaluate the function.
Besides such limitations, we aim to be accessible
<https://en.wikipedia.org/wiki/Computer_accessibility>, to be multilingual,
and to run in many different contexts, both on- and offline.
*“to collaboratively create and maintain”*: Collaborative creation and
maintenance is the core tenet of the Wikimedia projects. We don’t want
individuals to have "ownership" over a function or a set of functions, to
control what is accepted to the project, or decide which changes would be
welcome. This is a collaborative effort, and the default assumption is that
everyone can contribute new function definitions and implementations for
new functions, and that everyone can improve the documentation and the
test coverage of these.
*“a library of code functions”*: This defines the new knowledge format that
this project is addressing: code functions
<https://en.wikipedia.org/wiki/Subroutine>. We are not talking about
mathematical
functions <https://en.wikipedia.org/wiki/Function_(mathematics)>, but about
functions for which we can provide executable implementations. We want to
build a shared library of such functions that are connected with each
other, that use each other, and that are built on top of each other. A
single, comprehensive, common library also helps us to truly build on each
other’s work. For example, if one searches the Web today for a function
that calculates how many days have passed between two given dates, one can
easily find faulty implementations that do not take leap days into account.
By having one common library of functions to which anyone can contribute,
we hope that we can increase the overall quality of code in the world.
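As a small illustration of the leap-day problem, here is a sketch in Python. The function names days_between and naive_days_between are made up for this example; the first relies on the standard library's date arithmetic, the second is a deliberately faulty version of the kind one might find on the Web, assuming every year has 365 days.

```python
from datetime import date


def days_between(start: date, end: date) -> int:
    """Number of days from start to end (negative if end is earlier)."""
    # Subtracting two dates yields a timedelta; the standard library
    # accounts for month lengths and leap days.
    return (end - start).days


def naive_days_between(start: date, end: date) -> int:
    """Deliberately faulty: pretends that no year ever has a 29 February."""
    days_before_month = [0, 31, 59, 90, 120, 151, 181, 212, 243, 273, 304, 334]

    def ordinal(d: date) -> int:
        return d.year * 365 + days_before_month[d.month - 1] + d.day

    return ordinal(end) - ordinal(start)


print(days_between(date(2020, 2, 28), date(2020, 3, 1)))        # 2
print(naive_days_between(date(2020, 2, 28), date(2020, 3, 1)))  # 1
```

The two results differ because 2020 is a leap year: the naive version silently skips 29 February.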
*“to support the Wikimedia projects”*: As with Commons and Wikidata, the
founding, primary goal of Wikifunctions is to support the Wikimedia
projects. We want to first focus on functions that are useful for the
Wikimedia projects, that can help reduce the maintenance costs in the other
projects, and unlock new capabilities for the projects that have not been
possible before.
*“and beyond”*: Alongside the primary goal, we don’t want to restrict the
functions to only what is directly useful for the Wikimedia projects. On
the contrary, we want to provide a comprehensive library of functions
useful in many different areas: text processing, mathematics, natural
sciences, health care, environmental studies, decision making, natural
language generation, and many other areas.
*“for everyone to call”*: Everyone will be able to go to Wikifunctions,
find a function, enter the input arguments, and have the system evaluate
the function and see the result. We plan that making these calls will be
possible via direct evaluation on the Wikifunctions site, via inclusion on
a Wikimedia project, and via API calls. We expect there to be substantial
value to the world exposed through the API and its use on third party
sites, tools, and apps, in the same way that Wikidata statements, Wikimedia
Commons media files, and snippets of Wikipedia and other prose content
projects are re-used around the Web. Using a function does not need to be a
direct experience, but can also be embedded in another experience: for
example, it should be easy for a spreadsheet user to use a function from
Wikifunctions, or to allow third-party apps to use functions from
Wikifunctions, to have Wikifunctions functions be exposed through voice
interfaces or command lines, and many more ways.
*“and re-use”*: For many people and use cases the above direct calls will
be sufficient. But we cannot provide the computational resources for
everyone and for all use cases. So we must make it easy for a user to take
functions from Wikifunctions and re-use them, running them on computational
resources they provide, or embed them in completely new contexts. As with
all other Wikimedia projects, it should be very easy to download or export
code from Wikifunctions for use elsewhere, as well as the simpler direct
calls.
*“in the world’s natural”*: Wikifunctions aims to support all the languages
of our users – our editors, our functionaries, our translators, our
readers, our re-users, and all the people we don't yet reach. It will be
possible to run functions in Wikifunctions from an interface in the
language of the user, but also the natural language generation libraries in
Wikifunctions will target hundreds of languages.
*“and programming languages.”*: Wikifunctions will allow people to write
function implementations in a large number of programming languages.
Unfortunately, adding a programming language to Wikifunctions will always
be a bit of work, which will be somewhat of a bottleneck, and will require
us to stage the deployment of programming languages. We will start with
supporting Python and Javascript by the time Wikifunctions launches, but we
aim to support many other languages in a relatively short period of time.
Thanks to Moriel Schottlender
<https://meta.wikimedia.org/wiki/User:Mooeypoo> and Diana Montalion for
prompting the question, and everyone on the team and beyond for thinking
through it and helping with getting to a first draft of the mission
statement.
The on-wiki version of this week's newsletter can be found here:
https://meta.wikimedia.org/wiki/Abstract_Wikipedia/Updates/2021-04-02
When we started the development effort towards the Wikifunctions site, we
subdivided the work leading up to the launch of Wikifunctions into eleven
phases <https://meta.wikimedia.org/wiki/Abstract_Wikipedia/Phases>, named
after the letters of the Greek alphabet.
This week, we completed the third phase, Phase γ (gamma)
<https://meta.wikimedia.org/wiki/Abstract_Wikipedia/Phases#Phase_%CE%B3_(gam…>
.
With Phase α
<https://meta.wikimedia.org/wiki/Abstract_Wikipedia/Phases#Phase_%CE%B1_(alp…>
completed, it became possible to create instances of the system-provided
Types in the wiki. With Phase β
<https://meta.wikimedia.org/wiki/Abstract_Wikipedia/Phases#Phase_%CE%B2_(bet…>,
it became possible to create Types on-wiki and to create instances of these
Types.
The goal of Phase γ was to provide all the main Types of the pre-generic
function model
<https://meta.wikimedia.org/wiki/Abstract_Wikipedia/Pre-generic_function_mod…>
— Function, Implementation, Tester, Function call, Error, and so on. Some
of these Types have quite a bit of 'magic' attached to them. For example,
the screenshot below shows a Function call. The component for the Function
call first asks you to choose one of the Functions defined on-wiki, then
reads the chosen Function definition in order to get the definition of the
Arguments of the Function, and finally creates the UX to allow you to enter
the argument values.
The screenshot for example shows a Function call to the Function “if”,
taking a condition (which is just a literal “true” here), and then
returning either the consequent (in this case the String “This result”) or
the alternative (in this case the String “Not this one”). Given that the
condition is true, this Function call would evaluate to “This result”.
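For readers who prefer code to screenshots, the same Function call can be sketched in Python. This is only an illustration: the dictionary layout, the BUILTINS table, and the evaluate helper are assumptions for this example and are not the ZObject format or the evaluation logic used by the WikiLambda extension.

```python
def builtin_if(condition, consequent, alternative):
    """Return the consequent if the condition holds, else the alternative."""
    return consequent if condition else alternative


# Hypothetical table of functions the sketch knows how to evaluate.
BUILTINS = {"if": builtin_if}


def evaluate(obj):
    """Evaluate a literal or a function call described as a dictionary."""
    if not isinstance(obj, dict):
        return obj  # literals such as True or a string evaluate to themselves
    function = BUILTINS[obj["function"]]
    # Arguments may themselves be function calls, so evaluate them first.
    arguments = {name: evaluate(value) for name, value in obj["arguments"].items()}
    return function(**arguments)


call = {
    "function": "if",
    "arguments": {
        "condition": True,
        "consequent": "This result",
        "alternative": "Not this one",
    },
}

print(evaluate(call))  # This result
```

Note that this sketch evaluates both branches before choosing one, which is harmless for literals but would matter for nested calls.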
In order to demonstrate that we indeed completed Phase γ, we created the
Function call from the screenshot, and it is stored as Z10006 on the
notwikilambda <https://notwikilambda.toolforge.org/wiki/ZObject:Z10006>
site. As ever, our thanks and a shout out to Lucas Werkmeister for setting
up and maintaining notwikilambda in a volunteer role!
We cannot yet evaluate Function calls, only write them. In the next three
phases, we will work on making evaluation possible in different forms. We
are also aware that the UX needs design work; we are intentionally keeping
the UX very basic for now, as we are in an early phase. We will have major
updates on that in the following few weeks.
Alongside the main new feature, there have been plenty of other
improvements throughout the code base. We made progress on the back-end
components that will respond to requests to evaluate the code. We improved
the WikiLambda extension in terms of refactoring the code, particularly
through a Data Access Object for ZObjects. The front-end is currently in
the middle of a refactor to support new features more easily. We now display
some error messages in the UX, instead of failing silently. We have
components that allow entering code, for use in creating Implementations.
We considerably improved the experience and scalability when creating new
objects and assigning them a fresh ZID.
Outside the code changes themselves, we also had the Logo concept vote
<https://meta.wikimedia.org/wiki/Abstract_Wikipedia/Updates/2021-03-17>,
and folks from the Security team and the Performance team each embedded
with us for a few weeks, as we prepare for readiness reviews with regards
to these aspects. Also the Outreachy interns finished their work
<https://meta.wikimedia.org/wiki/Abstract_Wikipedia/Updates/2021-03-25> and
delivered their report. In fact, we closed 64 tasks in this phase - more
than in the previous two phases together.
Finally, there are also some bugs, and we are tracking known ones on the
Phabricator task board. We are still actively working on the Function call
component. If you encounter new bugs, please do raise them. Either let us
know, or create a new bug report in the column “To triage” so that we know
to review them.
We are now moving on to Phase δ (delta)
<https://meta.wikimedia.org/wiki/Abstract_Wikipedia/Phases#Phase_%CE%B4_(del…>.
The goal of this phase is to allow evaluating those Function calls which
can be done via built-in implementations. The implementations of the
built-ins have already landed a little ahead of the phase start, so one
major goal of this phase will be to wire up the MediaWiki site with the
WikiLambda extension to the middleware function-orchestrator
<https://gerrit.wikimedia.org/r/plugins/gitiles/mediawiki/services/function-…>
service. This will also make setting up an environment and running tests
more complicated, but of course closer to the reality of how Wikifunctions
will operate. At the end of this phase, we will be able to call one of the
built-in implementations, such as for the demonstration function call to
“if”.
This is a small change to the original plan. We had pulled the native
function calls forward
<https://meta.wikimedia.org/wiki/Abstract_Wikipedia/Updates/2021-02-18>,
but we are now noticing that we pulled them one phase too early - they
should be the fifth, not the fourth phase. We will devote Phase δ to the
built-in implementations, and then allow for native function calls in Phase
ε, when we will wire the function executors
<https://gerrit.wikimedia.org/r/plugins/gitiles/mediawiki/services/function-…>
up with the orchestrator. This will be followed by Phase ζ, which will
allow for function composition. These three upcoming phases (δ, ε, ζ) will
develop the three different ways to express implementations for functions,
which will be the beating heart of the technical features that
Wikifunctions will provide. There will still be tons of other things that
need to be developed and improved, without question — but these will be the
main steps towards providing a glimpse into what Wikifunctions will bring
to the Wikimedia movement and beyond.
I want to end with a big shout out to the whole team, and to the volunteers
who have contributed patches — in particular Arthur P. Smith, Gabriel Lee,
Lucas Werkmeister, Thiemo Kreuz, DannyS712 — and give a pointer to the Focus
languages submission process
<https://www.wikidata.org/wiki/Wikidata:Lexicographical_data/Focus_languages>.
The submission deadline is on 7 April. Interested people are welcome to add
their own submissions and attend the Q&A session regarding the Focus
languages process
<https://www.wikidata.org/wiki/Wikidata:Events/30_lexic-o-days_2021> which
is part of the 30 Lexic-o-days of Wikidata.