Thanks for the great discussion!
Renaming the topic, as the discussion deviated a bit from Lucas's original
announcement and turned to questioning whether we should do Wikilambda at all. So, this is
mostly to answer Gilles' comment as to why a collection of functions and
tests needs to be editable on a wiki - that's the premise of the first part
of the project.
So yes, we could maybe create the whole code for the renderers in a
classical way, as an Open Source project, going through git and gerrit, and
assume that we will have hundreds of new coders working on this, neatly
covering all supported languages, to keep up with the creation of renderers
when the community creates new constructors, etc. I have doubts. I want to
drastically reduce the barrier to participate in this task, and I think
that this is, in fact, necessary. And this is why we are going for
Wikilambda, a wiki of functions first, as said in my first email to the
list.
Amir, when I say function I mean a computation that accepts an input and
returns an output. That differs, for example, from Lua modules, which are
basically a small library of functions and helpers. We're also starting
with no side-effects and referential transparency, because we're boring,
and leave that for a bit later.
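To make that concrete, here is a minimal Python sketch of a function in this sense: pure, no side effects, referentially transparent. The example function is illustrative.

```python
def concatenate(a: str, b: str) -> str:
    # A pure function: the output depends only on the input,
    # and there are no side effects.
    return a + b

# Referential transparency: any call can be replaced by its value
# without changing the meaning of the program.
assert concatenate("wiki", "lambda") == "wikilambda"
```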
So, there will be wikipages that represent such a function. So on one page
we would say, "so, there's this function, in English it is called
concatenate, it takes two strings and returns one, and it does this".
Admittedly, not a very complicated function, but quite useful. Then we
might have several implementations for this function, for example in
different programming languages, we also might have tests for this
function. All implementations should always behave exactly the same. An
evaluation engine decides which implementations it wants to use (and, for
the result it shouldn't matter because they all should behave the same),
but some engines might prefer to run the JavaScript implementations, others
might run Python, others might decide "oh, wait, I am composing two
different functions together in order to get to a result, maybe it is
smarter if I pick two implementations with the same language, as I actually
can compile that into a single call beforehand", etc. There are plenty of
possibilities to improve and optimize evaluation engines, and if you were
asking me to dream I'd say I am seeing a vibrant ecosystem of evaluation
engines from different providers, running in vastly different contexts,
from the local browser to an open source peer-to-peer distributed computing
project.
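As a toy sketch of that idea (all names and structures here are illustrative, not the actual ZObject model): a function page holds several implementations, and an engine picks one by preferred language.

```python
# Illustrative only: one function with implementations in two
# languages, and a tiny evaluation engine that picks one by
# preference. All implementations must behave identically, so the
# engine's choice cannot affect the result.
implementations = {
    "javascript": "function concatenate(a, b) { return a + b; }",
    "python": "def concatenate(a, b):\n    return a + b",
}

def evaluate(impls, preferred, args):
    # Fall back to any available implementation if the preferred
    # language is missing.
    lang = preferred if preferred in impls else next(iter(impls))
    if lang == "python":
        namespace = {}
        exec(impls[lang], namespace)  # define the function
        return namespace["concatenate"](*args)
    raise NotImplementedError(f"no evaluator for {lang} in this sketch")

print(evaluate(implementations, "python", ("ab", "cd")))  # → abcd
```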
What looks like a programming language of its own in the existing ZObjects
(but isn't) is the ability to compose functions. So, say, you want to now
introduce a double_string function which takes one string and returns one,
you could either implement it in any of the individual programming
languages, or you can implement it as a composition, i.e.
double_string(arg1) := concatenate(arg1, arg1). Since you can have multiple
implementations, you can still go ahead and write a native implementation
in JavaScript or Python or WebAssembly, but because we know how this
function is composed from existing functions, the evaluation engine can do
that for you as well. (That's an idea I want to steal for the natural
language generation engine later too, but this time for natural languages.)
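A composition like that can itself be stored as data, so an evaluation engine can run it without any native double_string code. A minimal Python sketch of this (the dictionary shape is made up for illustration):

```python
# A library of known functions, keyed by name.
functions = {"concatenate": lambda a, b: a + b}

# double_string stored as a composition, not as native code:
# double_string(arg1) := concatenate(arg1, arg1)
double_string = {"call": "concatenate", "args": ["arg1", "arg1"]}

def evaluate(composition, bindings):
    # Resolve each argument name, then call the referenced function.
    args = [bindings[name] for name in composition["args"]]
    return functions[composition["call"]](*args)

print(evaluate(double_string, {"arg1": "ab"}))  # → abab
```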
By exposing these functions, the goal is to provide a library of such
functions that anyone can extend, maintain, browse, execute, and use for
any purpose. Contributors and readers can come in and compare different
implementations of the same function. They can call functions and evaluate
them in a local evaluation engine. Or in the browser (if supported). Or on
the server. That's the goal of the wiki of functions, what we call
Wikilambda in the proposal - a new Wikimedia project that puts functions
front and center, and makes it a new form of knowledge asset that the
Wikimedia movement aims to democratize and make accessible.
One goal is to vastly lower the barrier to contribute the renderers that we
will need for the second part of the project. But to also make it clear:
supporting the rendering of content and the creation of content for the
Wikipedias is our first priority, but it does not end there. Just as
Wikidata and Commons have the goal to support the Wikipedias (and within
these communities, the prioritization of this goal compared to other goals
may differ substantially from one contributor to the next), Wikilambda will
have that as a primary goal. But we would be selling ourselves short if we
stopped there. There are a number of interesting use cases that will be
enabled by this project, many of which are aligned with our vision of a
world in which everyone can share in the sum of all knowledge - and a
number of the 2030 strategic goals.
You may think, that is tangential, unnecessary, overly complicated - but
when discussing the overall goal of providing content in many languages
with a number of senior researchers in this area, it was exactly this
tangent that, frequently, made them switch from "this is impossible" to
"this might work, perhaps".
There have been a number of other topics raised in the thread that I also
want to address.
Regarding atomicity of updates by Gilles, which was answered by Arthur
that, if you're changing the functionality, you should create a new
function (that's similar in spirit to the content-addressable code
mentioned by Amirouche), and that's where we'll start (because I am a
simple person, and I like to start with simple things for the first
iteration). We might identify the need for a more powerful versioning
scheme later, and we'll get to it then. I hope to avoid that.
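A rough Python sketch of what that could look like, in the spirit of content-addressable code (purely illustrative, not a committed design): a function's identity derives from its definition, so a behavioral change produces a new function rather than an edit to the old one.

```python
import hashlib

def function_id(definition: str) -> str:
    # Identify a function by a hash of its definition, so old callers
    # keep pointing at the old definition by construction.
    return hashlib.sha256(definition.encode()).hexdigest()[:12]

v1 = function_id("def concatenate(a, b): return a + b")
v2 = function_id("def concatenate(a, b): return b + a")  # changed behavior
assert v1 != v2  # the change yields a distinct function, not an edit
```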
Amir asked whether the Lua will be the same as in the Scribunto modules.
We'll get to that when we get to implementations in Lua. I think it would
be desirable, and I also think it might be possible, but we'll get to it
later. We won't start with Lua.
As Arthur points out, the AbstractText prototype is a bit more complex than
necessary. It is a great prototype to show that it can work - having
functions implemented in JavaScript and Python and be called to evaluate a
composed function. It is not a great prototype to show how it should work
though, and indeed, as an implementation it is a bit more complex than what
we will initially need.
Regarding comments as asked by Louis, every implementation will come with
the ability to have documentation. Also, the implementations written in
programming languages will have the usual ability to have comments.
Regarding not doing it on MediaWiki, as suggested by Amirouche - well, I
had the same discussions regarding Wikidata and Semantic MediaWiki, and I
still think it was the best possible decision to build them on top of
MediaWiki. The amount of functionality we would need to reimplement
otherwise is so incredibly large, that a much larger team would be needed,
without actually devoting more resources to solving our novel problems.
(I am soooo sorry for writing such long emails, but I think it is important
in this early stage to overshare.)
On Wed, Jul 15, 2020 at 10:42 AM Amirouche Boubekki <
amirouche.boubekki(a)gmail.com> wrote:
On Wed, Jul 15, 2020 at 17:00, Gilles Dubuc
<gilles(a)wikimedia.org> wrote:
The part I don't get is why a collection of functions and tests needs to
be
editable on a wiki. This poses severe limitations in terms of versioning
if you start having functions depend on each other and editors can only
edit one at a time. This will inevitably bring breaking intermediary edits.
It seems like reinventing the wheel of code version control on top of a
system that wasn't designed for it. MediaWiki isn't designed with the
ability to have atomic edits that span multiple pages/items. Which is a
pretty common requirement for any large codebase, which this set of
functions sounds like it's posed to become.
I disagree that it must be done with the existing
wikimedia codebase. That said, I believe it is possible to build an
integrated development environment that fuses programming language
with a version control system. That is what unison is doing with
content-addressable code. Unison has different goals, but they have
i18n identifiers/docstrings on their roadmap.
In what way does this benefit from being done as
wiki-editable code
compared to a software project (or series of libraries/services)
hosted on
a git repository? Especially if we're talking about extensive programming
and tests, whoever is computer literate enough to master these concepts is
likely to know how to contribute to git projects as well, as they would
have very likely encountered that on their path to learning functional
programming.
My understanding is that wikilambda wants to be easier than going
through all the necessary knowledge to contribute to a software
project. That is, wikilambda could be something better than
git/github.
I get why such an architecture might be a choice
for a prototype, for
fast iteration and because the people involved are familiar
with wikis. But
what core contributors are familiar with and what can be used for a fast
prototype is rarely a suitable and scalable architecture for the final
product.
My understanding is that the prototype wanted to demonstrate that it
is possible to i18n the identifiers, docstrings, etc... of a
programming language. Something that nobody has tried or succeeded at doing
so far.
There is also a notion of how dynamic the content
is to take into
account. There is a difference between the needs of ever-changing
encyclopedic information, which a wiki is obviously good at handling, and
an iterative software project where the definition of a function to
pluralize some text in a language is highly unlikely to be edited on a
daily basis, but where interdependencies are much more likely to be an
important feature. Something that code version control is designed for.
The problem with git is that if you change the definition of function
in a file, every other place where you call that function, you need to
update the code. That is different from a content-addressable code
where you reference the content of the function, if you change the
function, it creates a new function (possibly with the same name) but
old code keeps pointing to the old definition. That is much more scalable
than the current git/github approach (even if you might keep calling
old code).
On Wed, Jul 15, 2020 at 4:42 PM Arthur Smith
<arthurpsmith(a)gmail.com>
wrote:
>
> On Wed, Jul 15, 2020 at 8:15 AM Amir E. Aharoni <
amir.aharoni(a)mail.huji.ac.il> wrote:
>>
>> I keep being confused about this point: What are the "functions" on
AW/Wikilambda, and in which language will they be written?
>
>
> I think everybody has a slightly different perspective on this. I've
been
working closely with Denny's prototype project (which included the
javascript version of 'eneyj' that Lucas was inspired by) at
https://github.com/google/abstracttext/ (I assume this will move
somewhere under wikimedia now?). The prototype does to an extent define its
own "language" - specifically it defines (human) language-independent "Z
Objects" (implemented as a subset of json) which encapsulate all kinds of
computing objects: functions, tests, numeric values, strings, languages
(human and programming), etc.
>
> I think right now this may be a little more complex than is actually
necessary
(are strict types necessary? maybe a 'validator' and a 'test'
should be the same thing?); on the other hand something like this is needed
to be able to have a wiki-style community editable library of functions,
renderers, etc. and I really like the underlying approach (similar to
Wikidata) to take all the human-language components off into label
structures that are not the fundamental identifiers.
>
> ZObjects for "implementations" of functions contain the actual code,
along with a key indicating the programming language of the code. Another
ZObject can call the function in some context which chooses an
implementation to run it. At a basic level this is working, but there's a
lot more to do to get where we want and be actually useful...
Arthur
_______________________________________________
Abstract-Wikipedia mailing list
Abstract-Wikipedia(a)lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/abstract-wikipedia
--
Amirouche ~
https://hyper.dev