The on-wiki version of this newsletter is here:
https://meta.wikimedia.org/wiki/Abstract_Wikipedia/Updates/2020-12-09
--
One of the great things in the Wikimedia movement is our opportunity to
partner with open source and open access communities across the Internet.
Today we'd like to introduce you to two new members of our effort who join
us via such a partnership.
The Wikimedia Foundation regularly partners with the Outreachy program
<https://www.outreachy.org/>, which offers three-month internships to work
remotely in Free and Open Source Software ("FOSS") with experienced
mentors. The program is open to both students and non-students, and
encourages applications from people who face under-representation, systemic
bias, or discrimination in the technology industry of their country.
We have the good fortune in Abstract Wikipedia to welcome two Outreachy
interns this round! Please join me in welcoming Aisha and Jade, who will be
focusing on analysing and understanding how templates and Lua modules are
used across Wikimedia wikis. This knowledge will be crucial in helping us
work out how the central wiki of functions can best support existing
community use cases, alongside the novel features that Abstract Wikipedia
will also provide. There's more detail on the outline project definition on
Phabricator <https://phabricator.wikimedia.org/T263678>.
In their own words:
*Aisha*
I completed my Computer Science undergraduate degree in February 2020, then
worked as a Machine Learning Engineer for 7 months before I joined
Wikimedia as an Intern. I am passionate about data and natural language
processing, to which I was introduced through my undergraduate thesis
research. I cook as a hobby and someday I wish to travel to far lands and
experience lots of cultures.
Communities have inspired me and have been my source of motivation for
quite some time. I owe my skills and knowledge to them and I am
enthusiastic to keep the process of dissemination going. And what is a
better place than an open-source community where free and open information
is not only celebrated but people work towards making it possible for
everyone else? This makes software and data so much more organic and gives
it the kind of atmosphere I love hanging around in. I am so glad to have
joined Wikimedia! Hoping to learn a lot and make valuable contributions!
*Jade*
I am a CS student, who will most likely finish the Bachelor's degree in the
summer of 2021.
While I was studying in school, for a long time, I was thinking that
linguistics would be my future - but learning to code through high school
made me choose another path. I felt that creating new things and easily
solving problems with programs created by yourself is really exciting, so I
stick to that decision.
In the University I was lucky enough to find the lecturer whose main
interest was computer linguistics, so I was able to combine both of my
passions, creating projects about predicting the style of the text (which
has far less flexible classification in Russian than in English) and about
translating poems to Interslavic.
Reading Abstract Wikipedia papers made me feel really enthusiastic, as it
showed me the really different point of view to the translation problems.
You don't have to strive for variety - but for simplicity, unambiguity and
richness of the language at the same time. And what can be better if this
initiative will help many more people find the knowledge they are looking
for?
We're so pleased to have the opportunity to work with Aisha and Jade, and
look forward to their contributions. You'll see them on IRC and on their
project repository
<https://github.com/wikimedia/abstract-wikipedia-data-science> (just set
up) in the coming months. Please do say hi!
Aisha and Jade will be blogging about their experience. We encourage you to
read their first posts about applying for the Outreachy
internship:
- Aisha's blog
<https://tanny411.github.io/2020/12/01/Getting-Started-With-Outreachy/>
- Jade's blog
<https://dev.to/lost_enchanter/scaredy-cat-s-guide-to-outreachy-admission-20…>
While you're at it, you may want to add the reports page
<https://www.mediawiki.org/wiki/Outreachy/Round_21/Bi-weekly_Reports> to
your watchlist, too, and see the fine work from all of our Outreachy
interns.
Thank you, Aisha and Jade, for joining us!
--The Abstract Wikipedia Team
The on-wiki version can be found here:
https://meta.wikimedia.org/wiki/Special:MyLanguage/Abstract_Wikipedia/Updat…
* * *
Today we will talk a little bit about how the catalog of functions will
actually work. A more formal and comprehensive description of this is the
function model
<https://meta.wikimedia.org/wiki/Special:MyLanguage/Abstract_Wikipedia/Wiki_…>,
but here we want to create a more intuitive explanation.
Every page in the new wiki will contain one *Object*. Every Object is of
exactly one *Type*. Types decide what the Object means. Types also decide
what makes an Object valid or not. An Object that is of a certain Type is
called an *instance* of that Type. Every Object that is stored as a wiki
page is called a *Persistent Object*, and such an Object has a Z-ID by
which it can be referenced. The Z-ID is the name of the page.
What does it mean that the Type decides what the Object means? Let’s look
at an example. An Object of Type *Positive Integer* with the value "2020"
represents the number 2020. That’s the number you get when you multiply 4, 5
and 101. And because it has the Type of Positive Integer, we know it is a
number. If the Type of "2020" was String, then that wouldn’t represent the
number 2020, but rather a sequence of four characters, "2", "0", "2", and
"0". If
the Type of it was *Gregorian Calendar Year* then this would represent the
year 2020 in that calendar, a leap year starting with a Wednesday and
ending with a Thursday.
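To make this concrete, here is a minimal Python sketch of the same value read under three Types. It is only an illustration: the class `TypedObject`, the function `describe`, and the use of plain strings as Type names are assumptions made for this example, not part of the function model.

```python
from dataclasses import dataclass
from datetime import date

# Fixed English day names, indexed by date.weekday() (Monday is 0).
DAY_NAMES = ["Monday", "Tuesday", "Wednesday", "Thursday",
             "Friday", "Saturday", "Sunday"]


@dataclass(frozen=True)
class TypedObject:
    """Hypothetical stand-in for an Object: a Type name plus a raw value."""
    type_name: str
    value: str


def describe(obj: TypedObject) -> str:
    """Interpret the raw value according to the Object's Type."""
    if obj.type_name == "Positive Integer":
        return f"the number {int(obj.value)}"
    if obj.type_name == "String":
        return f"a sequence of {len(obj.value)} characters"
    if obj.type_name == "Gregorian Calendar Year":
        first_day = DAY_NAMES[date(int(obj.value), 1, 1).weekday()]
        return f"a calendar year starting on a {first_day}"
    raise ValueError(f"unknown Type: {obj.type_name}")


# The same four characters mean three different things.
for type_name in ("Positive Integer", "String", "Gregorian Calendar Year"):
    print(describe(TypedObject(type_name, "2020")))
```

The raw value never changes in this sketch; only the Type attached to it does, and that alone changes the answer.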
The Type decides what you can do with an Object -- or, more formally
speaking, which Functions you can apply to the given Object. If you have a
String, you can count the length of the String, *i.e.* how many characters
are in the sequence. The result for "2020" above would be four. If you have
a Gregorian Calendar Year, you might also be able to ask for the 'length',
but the result should probably be 366 days (as it is a leap year). If you
have a Positive Integer and ask for its length, you might be asking for the
logarithm in base 10, or you might be asking for the multiplicative
persistence <https://en.wikipedia.org/wiki/Persistence_of_a_number>. But
all of these would be different Functions (that may well have the same
label, “length”, in English; their descriptions would explain in more
detail what they would do). The Function states which Types it can take,
and the Type decides what an Object means.
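The three readings of "length" can be sketched as three separate Python functions that happen to share a label. The function names and the `LENGTH_BY_TYPE` table are hypothetical, and for Positive Integer the sketch picks multiplicative persistence, which is just one of the two readings mentioned above.

```python
import calendar


def string_length(value: str) -> int:
    """Number of characters in the sequence."""
    return len(value)


def year_length_in_days(year: int) -> int:
    """Number of days in a Gregorian calendar year."""
    return 366 if calendar.isleap(year) else 365


def multiplicative_persistence(n: int) -> int:
    """How often the digits must be multiplied until one digit remains."""
    steps = 0
    while n >= 10:
        product = 1
        for digit in str(n):
            product *= int(digit)
        n = product
        steps += 1
    return steps


# Hypothetical lookup: one English label, a different Function per Type.
LENGTH_BY_TYPE = {
    "String": string_length,
    "Gregorian Calendar Year": year_length_in_days,
    "Positive Integer": multiplicative_persistence,
}

print(LENGTH_BY_TYPE["String"]("2020"))
print(LENGTH_BY_TYPE["Gregorian Calendar Year"](2020))
print(LENGTH_BY_TYPE["Positive Integer"](2020))
```

Looking the Function up by Type is a choice made for this sketch. The point is only that the label alone does not identify the Function.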
Types can be very diverse. Some Types might have a small, explicitly
defined number of instances. These are called *Enumerations*, because all
the instances of the Type are known in advance and enumerated, and each
instance must be one of these few possible values. The classic example of
such a Type is *Boolean*, named after George Boole
<https://en.wikipedia.org/wiki/George_Boole>, which has exactly two
instances: *True* and *False*. Another Enumeration would be the *Days of
the Week*. Other Types can be simple but have an infinite number of
possible instances. Examples of such Types are *String* for sequences of
characters, and *Positive Integer* for the counting numbers. Other Types
can be composed of simpler Types. For example, one way to represent an
*Integer* would be to compose it from a Positive Integer and a Boolean,
where the Boolean decides whether the Integer is positive or negative, and
the Positive Integer represents the absolute value.
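A rough Python analogy for Enumerations and composed Types follows. All class and field names are assumptions for illustration. The sketch also stores the absolute value as a plain non-negative `int` rather than a Positive Integer, so that zero can be represented; that is a simplification made here, not a decision of the project.

```python
from dataclasses import dataclass
from enum import Enum


class Boolean(Enum):
    """An Enumeration with exactly two instances."""
    TRUE = "True"
    FALSE = "False"


class DayOfWeek(Enum):
    """Another Enumeration: all seven instances are known in advance."""
    MONDAY = 1
    TUESDAY = 2
    WEDNESDAY = 3
    THURSDAY = 4
    FRIDAY = 5
    SATURDAY = 6
    SUNDAY = 7


@dataclass(frozen=True)
class Integer:
    """A Type composed of a Boolean sign and an absolute value."""
    is_negative: Boolean
    absolute_value: int  # assumption: plain non-negative int, so zero fits

    def __post_init__(self) -> None:
        if self.absolute_value < 0:
            raise ValueError("absolute_value must not be negative")

    def to_int(self) -> int:
        """Convert to Python's built-in int, applying the sign."""
        if self.is_negative is Boolean.TRUE:
            return -self.absolute_value
        return self.absolute_value


print(len(Boolean), len(DayOfWeek))
print(Integer(Boolean.TRUE, 2020).to_int())
```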
Not every Object has to be stored as a wiki page. For example, there is no
need to store the number 2020 as an Object in the wiki, as it can easily
be created on the fly.
This is how Objects are built and represented. Objects of almost all Types
are called *Literals*. A Literal is an Object that, when evaluated, results
in itself. For example, when you evaluate the number 2020, the result is
the number 2020. But there are two very special Types whose instances are
not Literals, and these two Types are *References* and *Function Calls*.
A Reference is a special Type of Object that refers to a Persistent Object
by its Z-ID. It basically just says “here, this Reference should really be
just the Object that is referenced by this Z-ID”. Evaluating a
Reference results in the referenced Object. A Reference does not need to be
evaluated immediately, but can be evaluated whenever needed. In fact, it is
sometimes impossible to evaluate all References fully, as doing so can
easily lead to infinite recursion and infinitely large Objects.
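Here is a small Python sketch of why References are evaluated only when needed. The store `PERSISTENT_OBJECTS`, the Z-IDs "Z9001" and "Z9002", and the helper names are all made up for this example. The two stored Objects refer to each other, so expanding every Reference fully would never finish.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Reference:
    """Refers to a Persistent Object by its Z-ID."""
    z_id: str


# Hypothetical store of Persistent Objects, keyed by invented Z-IDs.
PERSISTENT_OBJECTS = {
    "Z9001": {"label": "first", "other": Reference("Z9002")},
    "Z9002": {"label": "second", "other": Reference("Z9001")},
}


def evaluate_reference(ref: Reference) -> dict:
    """Evaluating a Reference results in the referenced Object."""
    return PERSISTENT_OBJECTS[ref.z_id]


def expand(obj, depth: int):
    """Resolve References up to `depth` levels; deeper ones stay as they are."""
    if isinstance(obj, Reference):
        if depth == 0:
            return obj
        return expand(evaluate_reference(obj), depth - 1)
    if isinstance(obj, dict):
        return {key: expand(value, depth) for key, value in obj.items()}
    return obj


# One level is resolved; the inner Reference is left unevaluated.
print(expand(Reference("Z9001"), 1))
```

The depth limit is one possible way to stop; the text above only says that evaluation can be deferred, not how.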
The other special Type of Object is the Function Call. A Function Call
consists of a *Function* and a *List of Arguments*. Evaluating a Function
Call is the core 'magic' of the whole system: it basically replaces a
Function Call with the result of that Function Call. For example,
evaluating the Function Call to the Function with the label "length" and
the argument "2020" of Type String replaces it with the value "4" of Type
Positive Integer.
Function Calls are usually created on the fly. We wouldn’t, in general,
create a new wiki page with a Persistent Object in which we store the
Function Call described above, but rather we would send that Function Call
to an evaluation engine which, in turn, evaluates the Function Call and
returns the result. There is really not much more to the system than that.
A Function Call is in many ways analogous to using a MediaWiki template
<https://www.mediawiki.org/wiki/Help:Templates> with parameters.
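The evaluation step described above can be sketched in a few lines of Python. The classes `Literal` and `FunctionCall`, the function `evaluate`, and the two example Functions are hypothetical. Evaluating the arguments before calling the Function is also an assumption of this sketch, not a statement about how the planned evaluation engine works.

```python
from dataclasses import dataclass
from typing import Any, Callable, Tuple


@dataclass(frozen=True)
class Literal:
    """An Object that evaluates to itself."""
    type_name: str
    value: Any


@dataclass(frozen=True)
class FunctionCall:
    """A Function together with its list of arguments."""
    function: Callable[..., Any]
    arguments: Tuple[Any, ...]


def string_length(text: Literal) -> Literal:
    """The Function labelled "length" for Strings."""
    return Literal("Positive Integer", len(text.value))


def concatenate(left: Literal, right: Literal) -> Literal:
    """A second example Function, used to show nested calls."""
    return Literal("String", left.value + right.value)


def evaluate(obj: Any) -> Any:
    """Replace a Function Call with its result; Literals stay unchanged."""
    if isinstance(obj, FunctionCall):
        evaluated_arguments = [evaluate(arg) for arg in obj.arguments]
        return evaluate(obj.function(*evaluated_arguments))
    return obj


call = FunctionCall(string_length, (Literal("String", "2020"),))
print(evaluate(call))
```

Because arguments may themselves be Function Calls, the same `evaluate` handles nesting, for example the length of two concatenated Strings.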
I hope that this explanation made some sense and helped with understanding
the planned system. We are currently developing the part of the system that
allows contributors to create Types inside the wiki itself, so that the
community can create and control which Types exist. Please feel free to ask
questions.
------------------------------
The voting for the final round of the naming contest
<https://meta.wikimedia.org/wiki/Abstract_Wikipedia/Wiki_of_functions_naming…>
is ongoing. Voting will be open until Monday, November 16. If you haven’t
already, please vote here and decide on a name for the new project where
all of these Objects we described above will be available. The proposals we
are voting on are, in reverse alphabetical order: *Wikimedia Functions*,
*Wikilambda*, *Wikifusion*, *Wikifunctions*, *Wikicodex*, and *Wikicode*.
Let us know your preferences and decide on the name of the new project by
Monday!
The newsletter is also available on-wiki here:
https://meta.wikimedia.org/wiki/Abstract_Wikipedia/Updates/2020-12-02
--
We are very happy to welcome Genoveva Galarza Heredero, who is joining the
Abstract Wikipedia team as a software engineer.
Genoveva finished her Software Engineering degree in Madrid in 2009, and
since then has worked in various places. She worked for some time in
Argentina and then moved to India, where she stayed for five years, first
working on the architecture of expert systems that leveraged NLP and Linked
Open Data. Then she started her own social venture, an effort to move
Indian IT hubs to more eco-friendly ways of commuting. She now lives in
Madrid again, and after working for some years as a full-stack developer in
projects on agriculture, green energy, and data privacy, has joined the
Wikimedia Foundation.
She is passionate about art and free culture online, and has used her hobby
as a comic artist to work with the Spanish and Latin American webcomic
community to create faneo.es, a non-profit project led by its members that
attempts to provide an open and free space to publish, read, and even
create comics collaboratively.
Let’s share a few of her words:
“I got into IRC and Usenet when I was a kid: I discovered communities,
became part of them and ultimately learned how to code with the help of
others who shared their knowledge online. I can honestly say that
everything that I am today - a programmer, a comic creator, even a nature
lover or a linguistics geek - I owe it to the availability of free and open
knowledge and culture online. I have seen in my own flesh how incredibly
self-defining it is to have access to local and global knowledge, so I am
really excited to join the Wikimedia movement and, as we say in Spanish,
put my little grain of sand to make this possible for others. Hopefully
I’ll be learning a lot more about how language works, about bigger
communities, and about defying the for-profit approach to progress and
succeeding! I now have a lot of catching up to do, but I’m looking forward
to working with all of you and hopefully making meaningful contributions
soon.”
Genoveva will also expand the project’s timezones, so we can have
discussions with volunteers from more global regions in real time. Given
that she is joining James Forrester as the second software engineer on the
team, she will be working on all the parts of the system for now: storage,
display, and editing on the wiki, the back-end evaluation engine service
and related APIs, the general UX, etc.
As the team grows, we will also start to implement more structure and
process in the development. We set up a new IRC channel at
#wikipedia-abstract-tech
<https://webchat.freenode.net/?channels=#wikipedia-abstract-tech> that will
(but does not yet) track changes to the source code and to the task boards.
Other technical streams will also be reflected there. You are all welcome
to join this new IRC channel.
Please join me in welcoming Geno to the team as we are continuing our work
on the new Wikimedia project.