The on-wiki version of this newsletter can be found here:
https://meta.wikimedia.org/wiki/Abstract_Wikipedia/Updates/2023-04-19
--
Selecting the right implementation
Functions in Wikifunctions can have more than one implementation. For
example, if we have a function that capitalizes the first letter
<https://wikifunctions.beta.wmflabs.org/wiki/Z10577> of a word, we can have
several implementations, e.g. one
<https://wikifunctions.beta.wmflabs.org/wiki/Z10711> or two in Python
<https://wikifunctions.beta.wmflabs.org/wiki/Z10713>, one in JavaScript
<https://wikifunctions.beta.wmflabs.org/wiki/Z10712>, and one using
composition <https://wikifunctions.beta.wmflabs.org/wiki/Z10579>. You might
find some of the implementations surprising. We previously discussed why we
made the design choice to allow multiple implementations
<https://meta.wikimedia.org/wiki/Abstract_Wikipedia/Updates/2021-06-17> for
a single function.
Until recently, whenever someone called a function that had multiple
implementations available, Wikifunctions selected the implementation to
use at random.
Implementations of the same function can have wildly different runtime
behavior. Some can be very slow, and others can be very fast: sorting a
list of 100,000 random numbers using bubble sort
<https://en.wikipedia.org/wiki/Bubble_sort> can take a minute on a current
processor, but with quicksort <https://en.wikipedia.org/wiki/Quicksort> the
same list of numbers can be sorted in less than two hundredths of a second -
faster than the blink of an eye. Much faster.
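To get a feel for the difference, here is a minimal sketch in Python of both algorithms with a small timing helper. It is only an illustration, not one of the implementations on Wikifunctions: the names bubble_sort, quicksort and time_call are made up for this example, and the list is kept at 2,000 numbers because pure Python is much slower than the figures quoted above.

```python
import random
import time


def bubble_sort(values):
    """Repeatedly swap adjacent out-of-order items: O(n^2) comparisons."""
    items = list(values)  # work on a copy, leave the input untouched
    for end in range(len(items) - 1, 0, -1):
        swapped = False
        for i in range(end):
            if items[i] > items[i + 1]:
                items[i], items[i + 1] = items[i + 1], items[i]
                swapped = True
        if not swapped:  # no swaps in this pass: already sorted
            break
    return items


def quicksort(values):
    """Partition around a pivot and recurse: O(n log n) on average."""
    items = list(values)
    if len(items) <= 1:
        return items
    pivot = items[len(items) // 2]
    smaller = [x for x in items if x < pivot]
    equal = [x for x in items if x == pivot]
    larger = [x for x in items if x > pivot]
    return quicksort(smaller) + equal + quicksort(larger)


def time_call(sort_function, values):
    """Return (elapsed seconds, sorted result) for a single call."""
    start = time.perf_counter()
    result = sort_function(values)
    return time.perf_counter() - start, result


if __name__ == "__main__":
    numbers = [random.random() for _ in range(2000)]
    for sort_function in (bubble_sort, quicksort):
        seconds, _ = time_call(sort_function, numbers)
        print(f"{sort_function.__name__}: {seconds:.4f} s")
```

The exact timings depend on the machine, but the gap between the two grows quickly as the list gets longer.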
In Wikifunctions, functions should be accompanied by testers. As of this
writing, the capitalization function we talked about earlier has only one
tester <https://wikifunctions.beta.wmflabs.org/wiki/Z10578>, which checks
that capitalizing the word “test” returns “Test”. If
all goes well, Wikifunctions will run each tester on each implementation.
The results of these tests are stored: does the implementation pass, how
many resources does it require, and other meta-data. This run-time
information is also shown to the user in a pop-up on request, for people
interested in the back-end details.
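The idea of running each tester on each implementation and keeping the results can be sketched in a few lines of Python. This is a simplified illustration under our own assumptions, not the actual Wikifunctions code: the two capitalization functions, the TesterResult record and run_testers are hypothetical names, and only wall-clock time is recorded as a stand-in for resource usage.

```python
import time
from dataclasses import dataclass


def capitalize_slice(word):
    """Hypothetical implementation: upper-case the first character."""
    return word[:1].upper() + word[1:]


def capitalize_loop(word):
    """Hypothetical second implementation with the same intended result."""
    characters = []
    for index, character in enumerate(word):
        characters.append(character.upper() if index == 0 else character)
    return "".join(characters)


@dataclass
class TesterResult:
    implementation: str
    tester: str
    passed: bool
    seconds: float  # wall-clock time, an assumed stand-in for "resources"


def run_testers(implementations, testers):
    """Run every tester against every implementation and collect results."""
    results = []
    for impl_name, impl in implementations.items():
        for tester_name, (argument, expected) in testers.items():
            start = time.perf_counter()
            try:
                passed = impl(argument) == expected
            except Exception:
                passed = False  # a crash counts as a failed test
            seconds = time.perf_counter() - start
            results.append(TesterResult(impl_name, tester_name, passed, seconds))
    return results


implementations = {"slice": capitalize_slice, "loop": capitalize_loop}
testers = {"test -> Test": ("test", "Test")}  # the single tester from the text
results = run_testers(implementations, testers)
for result in results:
    print(result)
```

With one tester and two implementations this produces two stored results, one per combination.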
Wikifunctions now ranks the implementations based on this meta-data, and
updates the internal order of the implementations. Test failures result in
downgrades, and quick results lead to a better ranking. And so, for the
last few weeks, instead of selecting an implementation at random, we have
been selecting the first implementation in that ranking. Here is an example
of that reordering
<https://wikifunctions.beta.wmflabs.org/w/index.php?title=Z10577&diff=prev&oldid=4357>
working in practice (but alas, diffs are not implemented yet).
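A ranking of this kind could look roughly like the following Python sketch. The exact formula Wikifunctions uses is not described here, so the ordering rule below (fewest failed tests first, then shortest average run time) is an assumption made for illustration, and ImplementationStats, rank_implementations and select_implementation are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class ImplementationStats:
    name: str
    failed_tests: int
    average_seconds: float


def rank_implementations(stats):
    """Assumed rule: fewest failed tests first, then fastest first."""
    return sorted(stats, key=lambda s: (s.failed_tests, s.average_seconds))


def select_implementation(stats):
    """Pick the top-ranked implementation instead of a random one."""
    return rank_implementations(stats)[0]


stats = [
    ImplementationStats("slow_but_correct", failed_tests=0, average_seconds=0.90),
    ImplementationStats("fast_but_failing", failed_tests=1, average_seconds=0.01),
    ImplementationStats("fast_and_correct", failed_tests=0, average_seconds=0.02),
]
print([s.name for s in rank_implementations(stats)])
print(select_implementation(stats).name)
```

In this made-up data set the failing implementation ends up last even though it is the quickest, and the fast, passing one is selected.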
This should lead to a considerable reduction in resource usage, and to
more consistent behavior of Wikifunctions. Function calls should time out
less often. This should also relieve the Wikifunctions community
from worrying about inefficient implementations and whether we should
accept them or not. Often, algorithms which are simpler are easier to read
and verify, but are slower: bubble sort is a good example of this, compared
with quicksort. Bubble sort is generally regarded as much easier to
explain and understand than quicksort. Having both allows the results
of the simpler implementation to be compared to the results of the more
complex implementation, with both passing the same suite of testers, thus
increasing our confidence in the overall system. At the same time, we can in
practice use the more efficient implementation and thus reduce overall
resource usage.
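Comparing a simple implementation against a more complex one can be sketched as follows in Python. This is an illustration under stated assumptions rather than how Wikifunctions does it: simple_sort is a plain bubble sort, fast_sort uses Python's built-in sorted as a stand-in for the more efficient implementation, and find_disagreements is a hypothetical helper.

```python
import random


def simple_sort(values):
    """Easy-to-verify reference implementation: bubble sort."""
    items = list(values)
    for end in range(len(items) - 1, 0, -1):
        for i in range(end):
            if items[i] > items[i + 1]:
                items[i], items[i + 1] = items[i + 1], items[i]
    return items


def fast_sort(values):
    """Stand-in for a more complex, more efficient implementation."""
    return sorted(values)


def find_disagreements(reference, candidate, inputs):
    """Return every input on which the two implementations differ."""
    return [data for data in inputs if reference(data) != candidate(data)]


rng = random.Random(0)  # fixed seed so the check is repeatable
inputs = [
    [rng.randint(-50, 50) for _ in range(rng.randint(0, 20))]
    for _ in range(200)
]
disagreements = find_disagreements(simple_sort, fast_sort, inputs)
print(f"{len(disagreements)} disagreements on {len(inputs)} random lists")
```

An empty list of disagreements does not prove either implementation correct, but every input on which they agree adds a little confidence in both.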
With this, the first version of a major element that will work behind the
scenes of Wikifunctions has been put into place, and we have delivered
another goal of the current phase.
Maria Keet’s reflection on Abstract Wikipedia so far
Maria Keet <http://www.meteck.org/> has been an active and central part of
the Natural Language Generation Workstream. She is a professor at the
University of Cape Town, South Africa, and her collaboration with Ariel
Gutman <https://meta.wikimedia.org/wiki/Abstract_Wikipedia/Updates/2022-12-19>
on the template language
<https://meta.wikimedia.org/wiki/Abstract_Wikipedia/Template_Language_for_Wikifunctions>
and her arguments have been mentioned in the fellows’ evaluation
<https://meta.wikimedia.org/wiki/Abstract_Wikipedia/Google.org_Fellows_evaluation>
and the answer
<https://meta.wikimedia.org/wiki/Abstract_Wikipedia/Google.org_Fellows_evaluation_-_Answer>.
Maria has now written down her own reflections and published them on her
blog:
keet.wordpress.com/2023/03/14/some-reflections-on-designing-abstract-wikipe…
The text is very accessible, gives context, explains some of the issues
that low-resource languages face, and makes suggestions on how to proceed.
Maria also describes some of the frustrating challenges she encountered in
having her voice heard and recognized. That part makes for a painful read,
and points to necessary changes.
To repeat her closing words:
The mountain we’ll keep climbing, be it with or without the Abstract
Wikipedia project. If Abstract Wikipedia is to become a reality and
flourish for many languages soon, it needs to allow for molehills,
anthills, dykes, dunes, and hills as well, and with whatever flowers
available to set it up and make it grow.
We are thankful to Maria for her ongoing contributions. We hope that we can
achieve a more inclusive space, with the goal of making contributing a
more wholesome experience.
Talk about Abstract Wikipedia in Sweden
Professor Aarne Ranta <https://www.cse.chalmers.se/~aarne/> will give a
talk on Natural Language Generation and Abstract Wikipedia on Thursday,
April 20th, 2023 at 17:30 local time, in the Maritime Museum and Aquarium
<https://sv.wikipedia.org/wiki/Sj%C3%B6fartsmuseet_Akvariet> in Göteborg,
Sweden. The in-person event is free to the public. The talk will be given
in Swedish.
You can find more information about the talk in Swedish here:
https://www.vetenskapsfestivalen.se/for-alla/kunskap-utan-granser-abstract-…