The on-wiki version of this newsletter can be found here:
Communities will create (at least) two different types of articles using
Abstract Wikipedia: on the one hand, we will have highly-standardised
articles based entirely on Wikidata, called model articles; and on the
other hand, we will have bespoke, hand-crafted content, assembled sentence
by sentence. Today we discuss the second type; we covered the first type,
model articles, in a previous newsletter.
Both types, by the way, can be implemented with the "templatic renderers"
concept that is part of Ariel Gutman’s proposal. We
will also dedicate a future newsletter to a comparison of the two types.
For manually-assembled articles, we have to make many more assumptions
about what will eventually be available in Wikifunctions than we do for
model-based articles. The following description is not meant to prescribe
to the community how things should work, but provides just the sketch of a
possibility. It is based on a "Wizard of Oz experiment"
<https://en.wikipedia.org/wiki/Wizard_of_Oz_experiment> we did during our
recent Abstract Wikipedia team offsite.
We took the first sentence from a semi-randomly chosen article, with the
aim of handcrafting the representation of that sentence in Abstract Wikipedia.
It's often harder to see how to translate articles about ideas than more
concrete things like people, places, and objects. The sentence came from
the English Wikipedia article Profit (economics)
<https://en.wikipedia.org/wiki/Profit_(economics)>, which we picked as a
common example of a concept:
An economic profit is the difference between the revenue a commercial
entity has received from its outputs and the opportunity costs of its
inputs.
Note that we do not expect that English Wikipedia will be the source for
all articles for Abstract Wikipedia, but it is certainly a convenient
source of inspiration for the team, given that all of us speak English. As
a baseline, we each manually translated that text into the languages we
speak.
One powerful tool, if not the most powerful, in our arsenal for turning
this sentence into abstract content is that we can rewrite and simplify it.
In Abstract Wikipedia the goal is not to translate as faithfully as
possible the wording of any existing Wikipedia articles, but to capture as
much as possible of the meaning of the articles. So we took the freedom to
rewrite the sentence as follows:
In economics, the profit of a commercial entity is defined as the
difference between its outputs’ revenue and its inputs’ opportunity cost.
We further reduced the sentence, due to time constraints, as simply:
In economics, profit is defined as the difference between revenue and cost.
From this, we then assembled the following abstract content:
- *Context*
  - *context*: economics <https://www.wikidata.org/wiki/Q8134>
  - *content*: *Definition*
    - *subject*: profit <https://www.wikidata.org/wiki/Q26911>
    - *definition*: *Difference*
      - *first*: income <https://www.wikidata.org/wiki/Q1527264>
      - *second*: operating cost <https://www.wikidata.org/wiki/Q831940>
Here, the bold text (the capitalized labels) is the label of a constructor,
the italic text (the lower-case labels) is the label of a key of the given
constructor, and the link points to a Wikidata
item. This follows the notation used in previous examples. Just as with
previous examples, we assume the availability of the used constructors. To
be explicit, in this case we assume the constructors listed below with
their respective keys. How the keys or constructors would be named, and in
fact which constructors and keys would even exist, may well turn out to be
very different.
*Context* returns a full clause representing a subordinate clause being put
in a context
- *context* takes a noun phrase describing the context
- *content* takes a clause that is being put in the context
*Definition* returns a full clause defining something as a definition
- *subject* takes a noun phrase that is being defined
- *definition* takes a noun phrase that represents the definition
*Difference* returns a noun phrase that means the quantitative difference
between two given noun phrases
- *first* takes a noun phrase that represents the first part
- *second* takes a noun phrase that represents the second part
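
The constructors and keys listed above can be sketched as plain data types. The following Python snippet is only an illustration under our own assumptions, not a proposal for how Wikifunctions would model them: the class names mirror the labels used above, and `Item` is a hypothetical stand-in for "a Wikidata item that a renderer can realize as a noun phrase".

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Item:
    """Hypothetical reference to a Wikidata item (realizable as a noun phrase)."""
    qid: str
    label: str  # English label, kept here only for readability


@dataclass(frozen=True)
class Difference:
    """Noun phrase: the quantitative difference between two noun phrases."""
    first: "NounPhrase"
    second: "NounPhrase"


# Anything that can fill a noun-phrase slot in this sketch.
NounPhrase = Union[Item, Difference]


@dataclass(frozen=True)
class Definition:
    """Clause: defines the subject by means of the definition."""
    subject: NounPhrase
    definition: NounPhrase


@dataclass(frozen=True)
class Context:
    """Clause: puts the content clause into the given context."""
    context: NounPhrase
    content: Definition


# The abstract content assembled in this newsletter.
profit_sentence = Context(
    context=Item("Q8134", "economics"),
    content=Definition(
        subject=Item("Q26911", "profit"),
        definition=Difference(
            first=Item("Q1527264", "income"),
            second=Item("Q831940", "operating cost"),
        ),
    ),
)
```

Keeping `Difference` as a noun phrase and `Definition` and `Context` as clauses reflects the return types described above; a different set of constructors would lead to a different set of types.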
Where we have mentioned "noun phrase" above, we actually mean "a concept that
can be realized as a noun phrase by a renderer". Also, we have glossed over
the considerable challenge of having a mechanism through which a renderer
could just take in a Wikidata item and turn it into a noun phrase. That is
a challenge that Mahir has tackled admirably with Ninai and Udiron.
Another challenge was to find the right Wikidata items for each of the
involved noun phrases. For example, for the second key of the Difference
constructor, we chose operating cost <https://www.wikidata.org/wiki/Q831940>.
Other candidates could have been cost
<https://www.wikidata.org/wiki/Q240673> or opportunity cost
<https://www.wikidata.org/wiki/Q185715>. Again, this is not necessarily the
best choice, but just the one we came up with, given our time constraints
and the way we approached the task.
The final step of the exercise was to take that abstract content, and to
render (by hand) a natural language text in the languages that we speak, as
mechanically as possible, using the labels of the selected Wikidata items
(ideally we would have used the lexemes connected to the items, but that
data was too sparse).
This step is why we called the whole exercise a “Wizard of Oz” exercise, as
we simulate here what renderers in Wikifunctions would do.
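
To make the "mechanical" part of this step concrete, here is a toy English renderer in Python. It is a sketch of what we did by hand, not of how renderers in Wikifunctions would be written: the nested-tuple encoding of the abstract content, the function names, and the fixed English templates are all assumptions made for this illustration, and the labels are the ones used in our English rendering.

```python
# Abstract content as (constructor, keys) pairs; bare strings are Wikidata IDs.
ABSTRACT = ("Context", {
    "context": "Q8134",
    "content": ("Definition", {
        "subject": "Q26911",
        "definition": ("Difference", {
            "first": "Q1527264",
            "second": "Q831940",
        }),
    }),
})

# English labels for the selected items (a real renderer would use lexemes).
EN_LABELS = {
    "Q8134": "economics",
    "Q26911": "economic profit",
    "Q1527264": "income",
    "Q831940": "operating cost",
}


def render_en(node):
    """Render a node of the abstract content as an English fragment."""
    if isinstance(node, str):
        # A Wikidata item: realize it as a bare noun phrase via its label.
        return EN_LABELS[node]
    name, keys = node
    if name == "Difference":
        first = render_en(keys["first"])
        second = render_en(keys["second"])
        return f"the difference between {first} and {second}"
    if name == "Definition":
        subject = render_en(keys["subject"])
        definition = render_en(keys["definition"])
        return f"{subject} is defined as {definition}"
    if name == "Context":
        context = render_en(keys["context"])
        content = render_en(keys["content"])
        return f"in {context}, {content}"
    raise ValueError(f"unknown constructor: {name}")


def sentence_en(node):
    """Capitalize the rendered fragment and add a full stop."""
    text = render_en(node)
    return text[0].upper() + text[1:] + "."


print(sentence_en(ABSTRACT))
```

Fixed string templates like these only work for English; languages such as German need a different word order inside the clause, which is one reason each language would need its own renderer.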
Here are some results (unfortunately, we didn’t record the results we came
up with during the offsite, so we re-created them for this newsletter):
*English*: In economics, economic profit is defined as the difference
between income and operating cost.
*German*: In Wirtschaftswissenschaft ist Gewinn definiert als der
Unterschied zwischen Einkommen und Betriebskosten.
*Croatian*: U ekonomiji, dobit je definiran kao razlika između dohodka i
*Russian*: В экономике, экономическая прибыль определяется как разница
между доходом и операционными затратами.
*French*: En économie, le profit est défini comme la différence entre les
revenus et les dépenses d'exploitation.
*Spanish*: En economía, ganancia económica se define como la diferencia
entre ingresos y costes*.
*Kannada*: ಅರ್ಥಶಾಸ್ತ್ರದಲ್ಲಿ, ಆರ್ಥಿಕ ಲಾಭವನ್ನು ಆದಾಯ ಮತ್ತು ನಿರ್ವಹಣಾ ವೆಚ್ಚದ
ನಡುವಿನ ಅಂತರವೆಂದು ವ್ಯಾಖ್ಯಾನಿಸಲಾಗಿದೆ.
*Hebrew*: בכלכלה, רווח מוגדר כהפרש בין הכנסה להוצאות תפעוליות.
*Swedish*: I nationalekonomi definieras vinst som skillnaden mellan inkomst
*Italian*: In economia, il profitto è definito come la differenza fra il
reddito e i costi operativi*.
*Arabic*: في الاقتصاد*، يتم تعريف الربح على أنه الفرق بين الدخل المالي والمصروفات
Words marked with an asterisk were manually translated by us, as
they did not at the time have a label in Wikidata, or the label did not fit.
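
The asterisk convention can be sketched as a small lookup: a manual translation, when present, takes precedence over the Wikidata label and is marked. This Python snippet is purely illustrative; the data layout and the function name are hypothetical, and the values are taken from the Spanish rendering above.

```python
# Labels that were usable as-is, keyed by (language code, Wikidata ID).
WIKIDATA_LABELS = {
    ("es", "Q8134"): "economía",
    ("es", "Q1527264"): "ingresos",
}

# Manual translations for items whose label was missing or did not fit.
MANUAL_TRANSLATIONS = {
    ("es", "Q831940"): "costes",
}


def pick_label(lang, qid):
    """Return the label to render, marking manual translations with '*'."""
    manual = MANUAL_TRANSLATIONS.get((lang, qid))
    if manual is not None:
        return manual + "*"
    return WIKIDATA_LABELS[(lang, qid)]


print(pick_label("es", "Q1527264"))
print(pick_label("es", "Q831940"))
```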
During the offsite, we evaluated the results and found them to be not
only readable (although not perfect), but also easier to understand than
our initial translations. This is likely an effect of the simplification
process the text underwent. The whole exercise left us filled with optimism
about the approach.
*This newsletter was late due to the amount of discussion it generated
internally. Don’t expect everyone on the team to agree on everything being
said here. We think these discussions should be in the open, for everyone
to join in. Expect more to follow.*
We are getting additional support from ThisDot technical writers: two
ThisDot technical writers will be joining the team for the remainder of
June to figure out how to onboard users to the concept of functions, and
how to communicate to users what functions are and how they work, in an
Below is the brief weekly summary highlighting the status of each workstream:
- Drafted the Performance Metrics document
- Started research on reported slowness in function evaluation
- Added logging and dashboarding to Beta Cluster and wrote documentation
for Beta Cluster
- Wrote a Proof of Concept of support for new Wikifunctions features to
support proposed NLG pipelines
- Altered MediaWiki PHP and Vue layers to handle either format
- Ensured that no function-orchestrator test code/cases employ the old
- WikiLambda PHP and Function-schemata finished and merged
- Design: continue working on typed list view
- Front-end: made ISO codes mobile friendly and started table component