The on-wiki version of this newsletter is here: https://meta.wikimedia.org/wiki/Abstract_Wikipedia/Updates/2021-01-21
This week’s newsletter is a bit more technical than others. Before we get to the main content, a reminder: the logo concept submission https://meta.wikimedia.org/wiki/Abstract_Wikipedia/Wikifunctions_logo_concept is currently open! Six logo proposals have already been made, and they are each very much worth a look. I hope to see more proposals coming in; the submission deadline is 16 February.
The formal function model https://meta.wikimedia.org/wiki/Abstract_Wikipedia/Function_model as it is currently written assumes that the whole of the function model is already implemented. In particular, it relies on generic types, that is, types that are parameterized, usually by another type.
What is an example of a parametric type? Let’s take a look at the type for lists. A list is an ordered series of elements. What is the type of these elements? It could be anything! There are plenty of operations on lists that can be performed without needing to know what the type of the elements in the list is. One can take the first element of the list, or reverse the order of the elements in the list, or take every second element from the list, or much more.
What is the type of the item that is returned by the function that outputs the first element of such a list? You can’t know. Since the elements of the list can be of any type, the return type of the function outputting the first element could also be any type.
Now, instead of having a single type called "list" you could also have types called, say, "list of strings", or "list of numbers", or "list of booleans". Or, even more complicated, "list of lists of strings". You could also have a list of any kind of elements, or multiple kinds at once. Now, if you have a function that returns the first element of a "list of strings", you know that the return type of that function will always be a string. You have more type safety https://en.wikipedia.org/wiki/Type_safety, and your code-writing tools can provide much better guidance when writing functions and function calls, because they know that the result must be a string. It is also easier to write the functions because you don’t have to check for cases where elements of other types pop up.
But on the other side, we suddenly need many more new types. In theory, an infinite number of specialised types. And for each of these types, we would need all the specialised functions dealing with that type, leading to an explosion in the number of functions. This would be a maintenance nightmare, because there would be so much code that needs to be checked and written, and it all is very similar.
In order to avoid that, we could write functions that create the type and functions that we need, and that take the type of the elements of the list as an argument. So instead of having a type “list of strings”, we would have a function, call it “list”, that has a single argument, and that you can call “list(string)” and that would result, on the fly, in a type that is a list whose elements are strings. Or you can call “list(integer)” and you get a list whose elements are integers.
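To make this concrete, here is a minimal sketch in Python of a function that creates list types on the fly. The name list_of and everything inside it are made up for this illustration; this is not how Wikifunctions implements or will implement generics, only one way to picture the idea.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def list_of(element_type):
    """Hypothetical type function: returns a list type for one element type.

    The cache means that calling list_of(str) twice yields the same type.
    """

    class TypedList(list):
        def __init__(self, items=()):
            items = list(items)
            # Reject any element that is not of the requested type.
            for item in items:
                if not isinstance(item, element_type):
                    raise TypeError(
                        f"expected {element_type.__name__}, "
                        f"got {type(item).__name__}"
                    )
            super().__init__(items)

    TypedList.__name__ = f"list({element_type.__name__})"
    return TypedList


ListOfStrings = list_of(str)          # corresponds to list(string)
ListOfIntegers = list_of(int)         # corresponds to list(integer)
ListOfListsOfStrings = list_of(ListOfStrings)

names = ListOfStrings(["Ada", "Grace"])
```

Because the type is the result of a function call, a "list of lists of strings" needs no separate definition; it is just list_of applied to the result of list_of(str).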
The same is true for functions: instead of having a function “first” that only works on lists of strings and returns a string, you would have a function that takes the type “string” as an argument and returns a function that works on a list of strings and returns a string. Instead of writing a “first” function for every kind of list, we would write a function that creates these functions and then call them when needed.
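As a rough illustration, such a function-creating function could look like the following Python sketch. The name make_first is hypothetical and the runtime check is only one possible design; it is meant to show the shape of the idea rather than an actual Wikifunctions implementation.

```python
from typing import Callable, List, Type, TypeVar

T = TypeVar("T")


def make_first(element_type: Type[T]) -> Callable[[List[T]], T]:
    """Hypothetical factory: returns a "first" function for one element type."""

    def first(items: List[T]) -> T:
        if not items:
            raise ValueError("cannot take the first element of an empty list")
        head = items[0]
        # The returned function guarantees the type of its result.
        if not isinstance(head, element_type):
            raise TypeError(
                f"expected {element_type.__name__}, got {type(head).__name__}"
            )
        return head

    return first


first_string = make_first(str)    # works on lists of strings, returns a string
first_integer = make_first(int)   # works on lists of integers, returns an integer

greeting = first_string(["hello", "world"])
```

The body of “first” is written once; each call to make_first produces a specialised version whose return type is known in advance.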
There are also situations where the input you need to operate on is constrained by its dimensions, instead of or in addition to its type. As a more complicated example, if we had a method to do matrix dot multiplication https://en.wikipedia.org/wiki/Matrix_multiplication, the number of columns in the first input matrix and the number of rows in the second must match. There, instead of taking just a type to create the matrix (say, a floating point number), our top-level type function would take two integers as the numbers of rows and columns as well. We could then call this method with matrix(float,4,3).
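The following Python sketch shows one way a type function with numeric arguments could work. The names matrix and dot, and the choice to store the shape on the generated type, are assumptions made for this example and are not taken from the function model.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def matrix(element_type, rows, cols):
    """Hypothetical type function: a matrix type fixed by element type and shape."""

    class Matrix:
        def __init__(self, data):
            data = [list(row) for row in data]
            if len(data) != rows or any(len(row) != cols for row in data):
                raise ValueError(f"expected {rows} rows of {cols} values")
            for row in data:
                for value in row:
                    if not isinstance(value, element_type):
                        raise TypeError(f"expected {element_type.__name__}")
            self.data = data

    # Record the parameters on the type so that functions can inspect them.
    Matrix.element_type = element_type
    Matrix.shape = (rows, cols)
    Matrix.__name__ = f"matrix({element_type.__name__},{rows},{cols})"
    return Matrix


def dot(a, b):
    """Multiply two matrices; the shapes are checked from their types."""
    a_rows, a_cols = type(a).shape
    b_rows, b_cols = type(b).shape
    if a_cols != b_rows:
        raise TypeError(
            f"cannot multiply {type(a).__name__} by {type(b).__name__}"
        )
    element_type = type(a).element_type
    zero = element_type()  # 0.0 for float, 0 for int
    product = [
        [
            sum((a.data[i][k] * b.data[k][j] for k in range(a_cols)), zero)
            for j in range(b_cols)
        ]
        for i in range(a_rows)
    ]
    # The result type is itself computed from the argument types.
    return matrix(element_type, a_rows, b_cols)(product)


left = matrix(float, 2, 3)([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]])
right = matrix(float, 3, 2)([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
result = dot(left, right)
```

Here the mismatch between, say, a matrix(float,2,3) and another matrix(float,2,3) is detected from the types alone, before any numbers are multiplied.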
Similarly, Earth-based ground positioning information is generally relayed by two dimensions of degree, and optionally one of altitude; there you might have tuple(float,2) for the former, and tuple(float,3) for the latter. The exact way that the Wikifunctions community would decide to model the position type is left up to them – in deciding on a type, they might also want to explicitly specify the planetary body, the datum, the accuracy in each dimension, or other data as well. We just need to make sure we provide the flexibility to editors to represent things they will need or want. Note also that common types, such as geocoordinates, will likely be created as named types on wiki, but structures can always be created on-the-fly too.
This idea has many different names and concrete realisations in different programming languages, such as templates, concepts, parametric polymorphism, generic functions and data types, among others. For our purposes, we call this generics, and it is currently scheduled to be implemented in Phase ζ https://meta.wikimedia.org/wiki/Abstract_Wikipedia/Phases#Phase_%CE%B6_(zeta):_generic_types. The problem is that the function model as it is currently described relies heavily on generics, but until they are in place, we can’t really use them. So we are in a kind of limbo, where the precise function model we are implementing right now is not specified anywhere, and instead we are adjusting on the fly based on the current state of the implementation and where we want to eventually end up.
In order to address that, we are publishing an explicit pre-generic function model https://meta.wikimedia.org/wiki/Abstract_Wikipedia/Pre-generic_function_model. Note that this also does not describe the model as it is right now, but it is the model that we aim to have mostly implemented by the end of Phase γ https://meta.wikimedia.org/wiki/Abstract_Wikipedia/Phases#Phase_%CE%B3_(gamma):_functions,_implementations,_errors and thus is much more immediately useful than the final function model that we have currently described on-wiki. The idea is that once we get around to supporting generics, we will shift over from the pre-generic function model to the full function model.
Comments and suggestions on the pre-generic function model and the plan presented here are, as always, very welcome.
Background Wikipedia articles:
- Generic programming https://en.wikipedia.org/wiki/Generic_programming
- Parametric polymorphism https://en.wikipedia.org/wiki/Parametric_polymorphism
- Generic function https://en.wikipedia.org/wiki/Generic_function
- Type safety https://en.wikipedia.org/wiki/Type_safety
And remember the ongoing logo concept proposal contest https://meta.wikimedia.org/wiki/Abstract_Wikipedia/Wikifunctions_logo_concept !