The on-wiki version of this newsletter can be found here:
https://meta.wikimedia.org/wiki/Abstract_Wikipedia/Updates/2022-07-29
--
A few weeks ago, we described the problem that in the canonical
representation of objects for Wikifunctions, the representation of typed
lists was very verbose, or, more briefly, that "typed lists were too looong"
<https://meta.wikimedia.org/wiki/Special:MyLanguage/Abstract_Wikipedia/Updat…>.
In that post, we gave a quick overview of the problem. The team started to
form an opinion, and even though the problem was rather technical, we
decided to describe it in detail for the community
<https://meta.wikimedia.org/wiki/Abstract_Wikipedia/Typed_lists>, discuss a
number of possible solutions, and describe their advantages and
disadvantages. We were hoping the community would help us solidify an
answer, and point out arguments we hadn’t considered. Even just the act of
describing the problem and the possible solutions in sufficient detail
would probably be beneficial for us, helping to converge on a solution.
That’s "rubber-ducking <https://en.wikipedia.org/wiki/Rubber_duck_debugging>"
in the large.
The benefits of this approach vastly exceeded my expectations.
Instead of merely discussing the pros and cons of the different proposals,
which was already very helpful, members of the community even made new
suggestions. One community member, Benjamin Degenhart, suggested in the unified
Telegram / IRC chat
<https://meta.wikimedia.org/wiki/Abstract_Wikipedia#Participate> a new
proposal, in which the first element of a list would declare the type of
the list. To be honest, my first thought was “cute, but too hacky”, but we
took up the suggestion, described the solution in full, and started
discussing it internally.
In the end, the team changed its preliminary preference from one of the
previous proposals to Benjamin’s newly proposed solution, and we called the
solution Benjamin arrays, to honor the community member who made the
proposal.
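To make the idea concrete, here is a minimal Python sketch, assuming a typed
list is serialized as a plain JSON array whose first element names the
element type. The helper names and the type identifier "Z6" are illustrative
assumptions, not the actual WikiLambda code.

```python
from typing import Any, List, Tuple


def to_benjamin_array(item_type: str, items: List[Any]) -> List[Any]:
    """Prefix the items with their declared type (hypothetical helper)."""
    return [item_type, *items]


def from_benjamin_array(array: List[Any]) -> Tuple[str, List[Any]]:
    """Split a Benjamin array into (declared type, items)."""
    if not array:
        # Even an empty typed list carries its type in the first slot.
        raise ValueError("a Benjamin array needs a type as its first element")
    return array[0], list(array[1:])


strings = to_benjamin_array("Z6", ["a", "b"])
print(strings)                         # ['Z6', 'a', 'b']
print(from_benjamin_array(strings))    # ('Z6', ['a', 'b'])
print(from_benjamin_array(["Z6"]))     # ('Z6', [])
```

Under this assumption an empty typed list is still a one-element array,
since the type declaration always occupies the first position.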
Over the last few weeks we implemented Benjamin arrays, which turned out to
be a more extensive task than we originally anticipated. That wasn’t due to
the specific solution — almost all of the other solutions would have been
similarly complex — but rather due to the fact that it touched such a core
aspect of the WikiLambda function model
<https://meta.wikimedia.org/wiki/Abstract_Wikipedia/Function_model>, and
that many places across a number of repositories were making assumptions
about how arrays were being represented.
Last week, Genoveva Galarza Heredero, the lead of our Experience work
stream, declared the change in the function model complete: Benjamin
arrays are now fully supported and implemented in all parts of the
Wikifunctions architecture.
Deciding on and implementing the canonical representation of typed lists is the
first of the goals of the current Phase θ
<https://meta.wikimedia.org/wiki/Abstract_Wikipedia/Updates/2022-04-08> that
we have reached. We have been working in parallel on the other goals as
well, but this is the first goal we have fully delivered. We look forward
to achieving the other goals in the next few weeks.
Monthly volunteer corners
Starting on 1st August 2022, we will have monthly "volunteers' corner"
meetings. You can join us via Google Meet, but we will also be tracking the
IRC / Telegram channel. If you have any questions, or things you would like
to work on, or some time you would like to contribute towards the
development of Wikifunctions, please drop by. Volunteers' corner meetings
will run from 17:30 to 18:00 UTC on the first Monday of every month.
More reading and listening
The Wikimedia Foundation Security team has published a blog post describing
their exploration of a new application security pipeline
<https://phabricator.wikimedia.org/phame/post/view/291/application_security_…>
that works well with the planned transition from Gerrit to GitLab. The
function-schemata component, which is part of the Abstract Wikipedia
architecture, was used as an example for the new pipeline.
Yaron Koren interviewed Julia Kieserman and Adam Baso of the Abstract
Wikipedia team for Episode 116 of his podcast Between the Brackets
<https://betweenthebrackets.libsyn.com/episode-116-adam-baso-and-julia-kiese…>.
Available in all the usual places for podcasts and very much worth a listen!
Workstream updates as of July 22, 2022
The week of July 15 was inspiration week, and will not be covered
individually.
Performance:
- Reproduced performance problems and will continue debugging the
slowness
- Met with SRE to discuss requirements for launch
- Completed end-to-end tests using the Beta Cluster
NLG:
- Iterated on design document describing the template language
- Implemented the template parser as part of the CLI tool; it is pending
code review
Meta-data:
- Completed needed modifications to existing Vue dialog components
- Finalized refinements to metadata dialog on function page
Experience:
- Officially completed all tasks pertaining to the following goal in the
scope of the current project phase: ‘We decide and implement a canonical
form for typed lists ("Benjamin arrays")’
- Finished and merged checkbox interactivity for table
- [DESIGN] Began research and explorations for the Default component
The on-wiki version of this newsletter can be found here:
https://meta.wikimedia.org/wiki/Abstract_Wikipedia/Updates/2022-07-20
--
Today we are happy to announce new people joining the Abstract Wikipedia /
Wikifunctions team.
*Rebecca Wambua* is joining us as Senior Product Manager on the team. Her
first goal is to accompany the launch and further development of
Wikifunctions. We are very excited to benefit from Rebecca’s energy and
enthusiasm! In Rebecca's own words:
Hey everyone!
My name is Rebecca.
I’m Kenyan by heritage and I live in Nairobi, Kenya.
I’m a product manager by profession and have most recently worked in
Microsoft as a product manager for Windows Admin Center and Windows
Containers & Bridge tooling. I am now working on Abstract Wikipedia. I’m
super excited to get working on Wikifunctions.
I love the mission of the Wikimedia Foundation to make knowledge available
to the whole world. I think it's noble and worthwhile to improve the lives
of other people and see the amazing things that knowledge-empowered people
can achieve.
Aside from work, I’m a happy dog mom. My dog is called Lilly and she’s 4
months old now and very playful. I spend most of my free time taking her
for walks and events. I also watch a ton of movies and listen to music as
well.
I’m an extrovert and so I love meeting new people and hearing people’s
experiences. This is definitely an amazing environment to be in and I look
forward to meeting and interacting with you all!
*Elena Tonkovidova* is Senior Quality and Testing Engineer with the
Wikimedia Foundation. We are hiring a team-specific QTE role
<https://boards.greenhouse.io/wikimedia/jobs/4321901?gh_src=0dcd8f191us>,
and we are very thankful to have Elena join us to help assess our needs and
eventually help integrate the new hire. She will help ensure a smooth
launch of Wikifunctions and a more consistent development of the project.
And in Elena's own words:
I've been with Wikimedia Foundation for almost eight years (saying 'hi' to
people on the team who know me already), working as a QA engineer with
different teams and on different projects. Each project I worked on -
VisualEditor, the Wikipedia Apps, Echo Notifications, Maps/Kartographer,
Structured data on commons, MediaSearch, ContentTranslation and
GrowthExperiments - has added a lot to my learning of the incredibly
complex picture of knowledge sharing challenges and to my understanding of
what our communities' expectations are.
After graduating with an MS in Physics, I worked for many years as a system
engineer, also pursuing my life-long interest in education by teaching from
time to time high school and college level Math and Physics classes. Fast
forward - after moving to the US (moving then to Canada and back again) I
started to work as a QA engineer and, to my surprise, I found that my new
work ties together my two big passions - technology and teaching.
Reading other introductions of the team I found many things that resonated
with me, e.g. I was super impressed reading about critical pedagogy
<https://en.wikipedia.org/wiki/Critical_pedagogy>, especially since I'm
currently taking the Coursera uncommon sense teaching
<https://www.coursera.org/learn/uncommon-sense-teaching> course.
I will be happy to work on the Abstract Wikipedia project - the project
that makes free knowledge more accessible and more reliable.
We are also very happy to announce a second set of Google.org fellows
<https://meta.wikimedia.org/wiki/Abstract_Wikipedia/Updates/2022-04-12> joining
us for six months, starting in July and going until December 2022. The
three fellows are:
*Sandy Woodruff*, UX Designer Fellow. Sandy (she/her) is a Bay Area-based
interaction designer with 9+ years of experience who strives to create
experiences that are both ethical and impactful. She has been at Google for
4.5 years on the Cloud AI & Industry Solutions UX team designing products
that make AI accessible to users with limited machine learning experience.
Outside of Google, Sandy loves helping others break into the field of UX
through career coaching. Before that, she designed at Etsy, Rent the
Runway, and Fab; advised for Cornell Tech’s incubator; and launched a
startup that helped New York City residents develop better recycling habits.
*Dani de Waal*, Software Engineer Fellow. Dani (she/her) has been working
at Google for the past three years on the Ad Manager team. Prior to diving
into the world of code, she could be found singing for her supper as a
professional actor/musician. She is thrilled by the opportunity to take
part in the Google.org fellowship, working on the Abstract Wikipedia
project. Dani is located in New York.
*Edmund Wright*, Software Engineer Fellow. Edmund (he/him) has been a
Software Engineer at Google for six years. He started in New York, working
on the Google Ad Manager frontend, and since 2019 has worked in London on
the Android Google app, with some part time work as a Privacy Advocate.
Before Google, Edmund was pursuing (but did not complete!) a PhD in
Economics. He lives in London with his partner and two daughters.
Sandy, Dani, and Edmund will work in the challenging space of figuring out
the user experience of Wikifunctions and Abstract Wikipedia, a crucial
aspect to make sure that the project really is accessible to many people
with heterogeneous backgrounds. We are extremely grateful to Google.org
for their trust and continued support, and with their new fellows we are
hoping to continue making steadier progress toward achieving our ambitious
goals of making Abstract Wikipedia and Wikifunctions available to everyone.
*Workstream updates* as of July 8, 2022
Performance:
- Worked with SRE on adding uptime monitoring using Prometheus in Beta
cluster
- Completed migration of the tester pipeline from orchestrator into MW
- Updated Beta Cluster to complete migration to Benjamin arrays
NLG:
- Semi-Formal specification of the proposed template language
- Examined the possibility of integrating template parsing into the CLI
tool
Meta-data:
- Finished refining & testing metadata dialog
Experience:
- Completed full-stack testing to find typed list-related issues
- Mobile fixes on function page: Adapt sidebar, edit button
- Fixed and merged bug on labels in fallback language not showing and
Codex issue with CdxLookup component
- [DESIGN] Editing typed lists handoff
Abstract Wikipedia,
Hello. I thought of an idea for Wikifunctions which I would like to share.
Could Wikifunctions additionally utilize Wikidata as a backend and utilize some interface, like Canvas2D [1][2], to generate diagrammatic visualizations of data for Wikipedia articles?
Could Wikifunctions additionally be utilized to generate snapshot and real-time charts, figures, graphs, and so forth from Wikidata data for Wikipedia articles?
Best regards,
Adam Sobieski
[1] https://developer.mozilla.org/en-US/docs/Web/API/Canvas_API
[2] https://d3js.org/
The on-wiki version of this newsletter can be found here:
https://meta.wikimedia.org/wiki/Abstract_Wikipedia/Updates/2022-07-12
--
Welcome, Amin!
As Aishwarya has rotated
<https://meta.wikimedia.org/wiki/Abstract_Wikipedia/Updates/2022-06-30> over
to the Trust and Safety Tool team, Amin Al Hazwani has joined us as our new
Designer. Amin has been working with open projects for a long time, and
joined the Wikimedia Foundation in August 2021. He previously worked with
the Trust and Safety Tool team.
As usual, I will let Amin introduce himself in his own words.
Hello everyone! My name is Amin, and I'm very excited to join Abstract
Wikipedia.
I’m an Italian designer with Syrian roots, currently living and working in
Milan, Italy. I have an itch for foreign languages, and that brought me to
Bolzano, a city surrounded by the Italian Alps, where I attended the
trilingual (English, German, and Italian) Faculty of Design and Art.
In 2010, I founded and ran my own design studio working between Bolzano,
Milan, and Berlin. Then, I joined Mozilla in Berlin, where I was one of the
founding members of the design systems initiative, a program that supported
the launch of the new Firefox in November 2017. Later I became a member of
the Firefox for Android team, working on experimental browsers. Together we
built and shipped Firefox for Amazon’s media players and smart displays.
In 2019, I moved back to Italy and worked at Accurat, a design and
development studio based in Milan and NYC focused on data visualizations.
We are looking forward to working with Amin and to seeing him shape
Wikifunctions. Please join me in welcoming Amin to the team!
Cory in residency
Our own Cory Massaro, software engineer at large at the Abstract Wikipedia
project and lead of the natural language generation workstream, is out for
a month-long art residency with arthereistanbul
<https://www.arthereistanbul.com/home> in Istanbul.
arthereistanbul is an organization founded in 2014 by Syrian artists. The
activities of the organization encompass a broader conversation between
artists and cultural practitioners regardless of nationality. The
organization maintains a space in İstanbul for artists to work, to promote
intercultural exchange, and to engage in activism.
Andrea Zambrano and Cory form the duo Tecnologías Silvestres. Andrea is a
painter who has worked with popular education, feminist activism, and
ecosystem conservation in Ecuador and Mexico on projects which denounce
violence towards women, girls, and trans people; exploitation by mining and
petroleum interests; and state and market policies which loot and exploit
communities.
Their project "UMBRAL" concerns the bridge / gateway / threshold as spaces
of encounter, transition, or tension between spaces and times. They
investigate how hegemonic and counter-hegemonic forces meet in bridge /
gateway spaces. They ask, "What is hegemony; who perpetrates it?" "How can
we strengthen counter-hegemonic forces?" "What are the hegemonic and
counter-hegemonic effects of emerging technologies?"
The platform the Abstract Wikipedia team is building intends to serve
Wikipedia's least-resourced language communities. It is thus foundationally
bound up in issues of digital colonialism and geographic, linguistic, or
cultural privilege, and has to adopt an anti-hegemonic approach in order to
succeed. This residency is a deliberate step to foster an egalitarian,
decolonial, and empathetic ecosystem.
While Cory is in Istanbul, and if you happen to be near enough: if you are
interested in Abstract Wikipedia, knowledge equity, curbing digital
colonialism, and / or a language which is underrepresented on Wikipedia,
he'd love to talk to you! You can reach out to him at
cory.massaro(a)gmail.com.
Other and Workstream updates
We are hiring for a Quality and Test Engineering position! Apply here
<https://boards.greenhouse.io/wikimedia/jobs/4321901?gh_src=0dcd8f191us>,
and for more information see the last newsletter
<https://meta.wikimedia.org/wiki/Abstract_Wikipedia/Updates/2022-06-30>.
Apologies for the delays in the newsletter, but I was traveling and got
COVID.
Workstream updates (as of June 24)
Great progress on merging big changes on typed lists
Performance:
- SRE ServiceOps have reviewed the Performance Metrics document and
approved the metrics
- Made great progress on end-to-end testing using the Beta Cluster and
migration of the tester pipeline from orchestrator into MediaWiki
NLG:
- Drafted initial design document describing templating language
Meta-data:
- Implemented supporting Vue / Vuex components for metadata dialog and
built initial presentation of metadata dialog (v. 0.1)
Experience:
- Merged cross-repo schemata+WikiLambda+orchestrator clean-up patches
- Merged v0 of tester and implementation table interfaces
- Finished text with fallback implementation for fallback title
- [DESIGN] Started work on editing typed lists
The on-wiki version of this newsletter can be found here:
https://meta.wikimedia.org/wiki/Abstract_Wikipedia/Updates/2022-06-21
--
Communities will create (at least) two different types of articles using
Abstract Wikipedia: on the one hand, we will have highly-standardised
articles based entirely on Wikidata, called model articles; and on the
other hand, we will have bespoke, hand-crafted content, assembled sentence
by sentence. Today we discuss the second type, having covered the first
type, model articles, in a previous newsletter
<https://meta.wikimedia.org/wiki/Abstract_Wikipedia/Updates/2022-06-07>.
Both types, by the way, can be implemented by the "templatic renderers"
concept that is part of Ariel Gutman’s proposal
<https://meta.wikimedia.org/wiki/Abstract_Wikipedia/Updates/2022-05-27>. We
will also dedicate a future newsletter to a comparison of the two types.
For manually-assembled articles, we have to make many more assumptions
about what will eventually be available in Wikifunctions than we do for
model-based articles. The following description is not meant to prescribe
to the community how things should work, but provides just the sketch of a
possibility. It is based on a "Wizard of Oz experiment"
<https://en.wikipedia.org/wiki/Wizard_of_Oz_experiment> we did during our
recent Abstract Wikipedia team offsite
<https://meta.wikimedia.org/wiki/Abstract_Wikipedia/Updates/2022-05-20>.
We took the first sentence from a semi-randomly chosen article, with the
aim to handcraft the representation of said sentence in Abstract Wikipedia.
It's often harder to see how to translate articles about ideas than more
concrete things like people, places, and objects. The sentence came from
the English Wikipedia article Profit (economics)
<https://en.wikipedia.org/wiki/Profit_(economics)>, which we picked as a
common example of a concept:
An economic profit is the difference between the revenue a commercial
entity has received from its outputs and the opportunity costs of its
inputs.
Note that we do not expect that English Wikipedia will be the source for
all articles for Abstract Wikipedia, but it is certainly a convenient
source of inspiration for the team, given that all of us speak English. As
a baseline, we each manually translated that text into the languages we
speak.
One powerful tool in our arsenal, if not the most powerful, for turning
this sentence into abstract content is that we can rewrite and simplify it.
In Abstract Wikipedia the goal is not to translate as faithfully as
possible the wording of any existing Wikipedia articles, but to capture as
much as possible of the meaning of the articles. So we took the freedom to
rewrite the sentence as follows:
In economics, the profit of a commercial entity is defined as the
difference between its outputs’ revenue and its inputs’ opportunity cost.
Due to time constraints, we further reduced the sentence to simply:
In economics, profit is defined as the difference between revenue and cost.
From this, we then assembled the following abstract content.
*Context*
- *context*: economics <https://www.wikidata.org/wiki/Q8134>
- *content*: *Definition*
  - *subject*: profit <https://www.wikidata.org/wiki/Q26911>
  - *definition*: *Difference*
    - *first*: income <https://www.wikidata.org/wiki/Q1527264>
    - *second*: operating cost <https://www.wikidata.org/wiki/Q831940>
Here, the bold text (the capitalized names) is the label of a constructor,
the italic text (the lowercase names) is the label of a key of the given
constructor, and the link points to a Wikidata
item. This follows the notation used in previous examples. Just as with
previous examples, we assume the availability of the used constructors. To
be explicit, in this case we assume the constructors listed below with
their respective keys. How the keys or constructors would be named, and in
fact, which constructors and keys would even exist, might very well be very
different.
*Context* returns a full clause representing a subordinate clause being put
in a context
- *context* takes a noun phrase describing the context in which the
content is placed
- *content* takes a clause that is being put in the context
*Definition* returns a full clause defining something as a definition
- *subject* takes a noun phrase that is being defined
- *definition* takes a noun phrase that represents the definition
*Difference* returns a noun phrase that means the quantitative difference
between two given noun phrases
- *first* takes a noun phrase that represents the first part
- *second* takes a noun phrase that represents the second part
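As a rough sketch of how these constructors and keys could look as data
structures, the following Python models them as dataclasses and rebuilds the
abstract content from above. The class layout and the `Item` wrapper for
Wikidata references are assumptions made for illustration; nothing here
prescribes how Wikifunctions will represent constructors.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Item:
    """Reference to a Wikidata item that a renderer can realize as a noun phrase."""
    qid: str


@dataclass(frozen=True)
class Difference:
    """Noun phrase: the quantitative difference between two noun phrases."""
    first: "NounPhrase"
    second: "NounPhrase"


# Anything that can stand where the text above says "noun phrase".
NounPhrase = Union[Item, Difference]


@dataclass(frozen=True)
class Definition:
    """Clause: defines the subject by means of the definition."""
    subject: NounPhrase
    definition: NounPhrase


@dataclass(frozen=True)
class Context:
    """Clause: places the content clause in the given context."""
    context: NounPhrase
    content: Definition


profit_sentence = Context(
    context=Item("Q8134"),  # economics
    content=Definition(
        subject=Item("Q26911"),  # profit
        definition=Difference(
            first=Item("Q1527264"),  # income
            second=Item("Q831940"),  # operating cost
        ),
    ),
)

print(profit_sentence.content.definition.second.qid)  # Q831940
```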
Where we have mentioned "noun phrase" above, we actually mean "concept that
can be realized as a noun phrase by a renderer". Also, we have glossed over
the considerable challenge of having a mechanism through which a renderer
could just take in a Wikidata item and turn it into a noun phrase. That is
a challenge that Mahir has tackled admirably with Ninai and Udiron
<https://meta.wikimedia.org/wiki/Abstract_Wikipedia/Updates/2021-09-03>.
Another challenge was to find the right Wikidata items for each of the
involved noun phrases. For example, for the second key of the Difference
constructor, we chose operating cost <https://www.wikidata.org/wiki/Q831940>.
Other candidates could have been cost
<https://www.wikidata.org/wiki/Q240673> or opportunity cost
<https://www.wikidata.org/wiki/Q185715>. Again, this is not necessarily the
best choice, but just the one we came up with, given our time constraints
and the way we approached the task.
The final step of the exercise was to take that abstract content, and to
render (by hand) a natural language text in the languages that we speak, as
mechanically as possible, using the labels of the selected Wikidata items
(ideally we would use the lexemes connected to the items, but those were
too sparse).
This step is why we called the whole exercise a “Wizard of Oz” exercise, as
we simulate here what renderers in Wikifunctions would do.
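The mechanical rendering step can be imitated with a toy English renderer.
This is only a sketch under strong assumptions: the abstract content is held
as nested dictionaries, the English labels are hard-coded from the English
rendering in this newsletter rather than fetched from Wikidata, and each
constructor maps to one fixed English template invented for this example.

```python
# English labels, hard-coded for the sketch (not fetched from Wikidata).
LABELS_EN = {
    "Q8134": "economics",
    "Q26911": "economic profit",
    "Q1527264": "income",
    "Q831940": "operating cost",
}

# The abstract content as nested dictionaries (assumed representation).
content = {
    "constructor": "Context",
    "context": "Q8134",
    "content": {
        "constructor": "Definition",
        "subject": "Q26911",
        "definition": {
            "constructor": "Difference",
            "first": "Q1527264",
            "second": "Q831940",
        },
    },
}


def render_en(node):
    """Render a node (a QID string or a constructor dict) as English text."""
    if isinstance(node, str):
        # A bare QID is realized by its English label.
        return LABELS_EN[node]
    kind = node["constructor"]
    if kind == "Difference":
        first = render_en(node["first"])
        second = render_en(node["second"])
        return f"the difference between {first} and {second}"
    if kind == "Definition":
        subject = render_en(node["subject"])
        definition = render_en(node["definition"])
        return f"{subject} is defined as {definition}"
    if kind == "Context":
        context = render_en(node["context"])
        clause = render_en(node["content"])
        return f"In {context}, {clause}."
    raise ValueError(f"unknown constructor: {kind}")


print(render_en(content))
```

A real renderer would also have to handle agreement, articles, and word
order per language, which fixed templates like these do not attempt.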
Here are some results (unfortunately, we didn’t record the results we came
up with during the offsite, so we re-created them for this newsletter):
*English*: In economics, economic profit is defined as the difference
between income and operating cost.
*German*: In Wirtschaftswissenschaft ist Gewinn definiert als der
Unterschied zwischen Einkommen und Betriebskosten.
*Croatian*: U ekonomiji, dobit je definiran kao razlika između dohodka i
troška*.
*Russian*: В экономике, экономическая прибыль определяется как разница
между доходом и операционными затратами.
*French*: En économie, le profit est défini comme la différence entre les
revenus et les dépenses d'exploitation.
*Spanish*: En economía, ganancia económica se define como la diferencia
entre ingresos y costes*.
*Kannada*: ಅರ್ಥಶಾಸ್ತ್ರದಲ್ಲಿ, ಆರ್ಥಿಕ ಲಾಭವನ್ನು ಆದಾಯ ಮತ್ತು ನಿರ್ವಹಣಾ ವೆಚ್ಚದ
ನಡುವಿನ ಅಂತರವೆಂದು ವ್ಯಾಖ್ಯಾನಿಸಲಾಗಿದೆ.
*Chinese*: 在经济学中,经济利润被定义为收入与经营成本之间的差额。
*Hebrew*:
בכלכלה, רווח מוגדר כהפרש בין הכנסה להוצאות תפעוליות.
*Swedish*: I nationalekonomi definieras vinst som skillnaden mellan inkomst
och Opex.
*Italian*: In economia, il profitto è definito come la differenza fra il
reddito e i costi operativi*.
*Arabic*:
في الاقتصاد*، يتم تعريف الربح على أنه الفرق بين الدخل المالي والمصروفات
الجارية.
Words marked with an asterisk were translated manually by us, as
they did not at the time have a label in Wikidata, or the label did not fit.
During the offsite, we evaluated the results, and found them in fact not
only readable (although not perfect), but also easier to understand than
our initial translation. This is likely an effect of the simplification
process the text underwent. The whole exercise left us filled with optimism
about the approach.
*This newsletter was late due to the amount of discussion it generated
internally. Don’t expect everyone on the team to agree on everything being
said here. We think these discussions should be in the open, for everyone
to join in. Expect more to follow.*
*Further updates:*
We are getting additional support from ThisDot technical writers: Two
ThisDot technical writers will be joining the team for the remainder of
June to figure out how to on-board users into the concept of functions, and
how to communicate to users what functions are and how they work, in an
easily-translatable manner.
Below is the brief weekly summary highlighting the status of each workstream:
Performance:
- Drafted the Performance Metrics document
- Started research on reported slowness in function evaluation
- Added logging and dashboarding to Beta Cluster and wrote documentation
for Beta Cluster
NLG:
- Wrote a Proof of Concept of support for new Wikifunctions features to
support proposed NLG pipelines
Meta-data:
- Altered MediaWiki PHP and Vue layers to handle either format
- Ensured that no function-orchestrator test code/cases employ the old
format
Experience:
- WikiLambda PHP and Function-schemata finished and merged
- Design: continue working on typed list view
- Front-end: made ISO codes mobile friendly and started table component
implementation