The on-wiki version of this newsletter can be found here: https://meta.wikimedia.org/wiki/Abstract_Wikipedia/Updates/2023-09-20 -- Renderers and parsers for types
Wikifunctions currently supports two types: Strings and Booleans. To make Wikifunctions useful, we need to support many more types, such as numbers, dates, geocoordinates, and eventually Wikidata lexemes and items. Types define what kind of inputs and outputs the functions in Wikifunctions can have.
With Wikifunctions, we don’t want to just repeat what different programming languages have done, but, if possible, gently update the lessons that have been learned from programming language research and experience and make sure that we are as inclusive as possible.
Strings and Booleans were very carefully chosen for the first deployment of Wikifunctions: Strings https://www.wikifunctions.org/wiki/Z6, because they are just a specific sequence of Characters, and do not depend on the user’s language. Booleans https://www.wikifunctions.org/wiki/Z40, because they are a key basis of logic flow for programming. Further, they can be fully translated in Wikifunctions – the two values, True https://www.wikifunctions.org/wiki/Z41 and False https://www.wikifunctions.org/wiki/Z42, are both represented by a Wikifunctions object that can have names in any of the languages we support. Since the initial deployment, more than a dozen translations have been added! If you can add more, that would be great.
One example of a possible next type that would be interesting to introduce would be whole numbers. This raises a big question: how should we represent an integer?
Most programming languages have two answers to that. Internally, they usually represent an integer as a binary string of a fixed length, so that numbers can be stored and processed efficiently. In human-readable source code, they usually represent it as a sequence of Arabic numerals https://en.wikipedia.org/wiki/Arabic_numerals, e.g. 4657388. Some programming languages are nice enough to allow grouping of the digits: in Ada https://en.wikipedia.org/wiki/Ada_(programming_language) you may write 4_657_388, or, if you prefer the Indian system https://en.wikipedia.org/wiki/Indian_numbering_system, 46_57_388, which makes these numbers a bit more readable.
But programming languages where one can write ৪৬,৫৭,৩৮৮ using Bengali numerals https://en.wikipedia.org/wiki/Bengali_numerals, referring to the same number, are rare https://sjishan.github.io/chascript/. For Wikifunctions, we want to rectify this, to make sure that the whole system supports every human language fluently and consistently.
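As a small illustration of the gap described above, here is how Python (chosen only as an example; it is not mentioned in this newsletter) handles these notations. Underscore grouping works in source literals much like the Ada example, while Bengali digits are accepted only when converting a string at run time, and only without separators.

```python
# Underscore digit grouping in source literals, similar to the Ada example.
western = 4_657_388
indian = 46_57_388  # where the underscores sit does not change the value

# Bengali digits cannot be written as a Python source literal, but int()
# accepts Unicode decimal digits when converting a string at run time.
from_bengali = int("৪৬৫৭৩৮৮")

print(western == indian == from_bengali)

# Grouping commas are not understood by int(); this is the kind of
# flexibility a community-written parser would have to add.
try:
    int("৪৬,৫৭,৩৮৮")
    accepts_commas = True
except ValueError:
    accepts_commas = False
print(accepts_commas)
```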
Internally, we will represent numbers - like every other object - as ZObjects. The above number would be represented internally as follows (using the prototype ZID from the Beta https://wikifunctions.beta.wmflabs.org/view/en/Z10015, since we don’t yet have the respective type in the real Wikifunctions):
{ "Z1K1": "Z10015", "Z10015K1": "4657388"}
Or, with labels in English:
{ "type": "positive integer", "value": "4657388"}
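The relationship between the two forms above can be sketched in Python as a simple lookup. This is only an illustration: the LABELS_EN table and the labelize helper are hypothetical names, and the labels are copied from the example rather than fetched from Wikifunctions.

```python
import json

# Hypothetical label table for illustration only; in Wikifunctions the
# labels are stored on the wiki and exist per language.
LABELS_EN = {
    "Z1K1": "type",
    "Z10015": "positive integer",
    "Z10015K1": "value",
}


def labelize(zobject: dict, labels: dict) -> dict:
    """Replace known ZIDs and key IDs with labels; leave other strings alone."""
    result = {}
    for key, value in zobject.items():
        new_key = labels.get(key, key)
        new_value = labels.get(value, value) if isinstance(value, str) else value
        result[new_key] = new_value
    return result


internal = json.loads('{ "Z1K1": "Z10015", "Z10015K1": "4657388"}')
labelled = labelize(internal, LABELS_EN)
print(json.dumps(labelled))
```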
Even though this solves the internal representation, we would want to avoid displaying this object in the system if possible. Instead, we plan to allow the Wikifunctions community to attach a 'renderer' and a 'parser' to each type. The renderer would be a function that takes an object of the given type (in this case, an object of the type positive integer) and a language, and returns a string. The parser is the opposite of that: it takes a string and a language, and returns an object of type positive integer.
This would allow the Wikifunctions community to create functions for each type and language that would decide how the values of the type are going to be displayed in the given language. In a Bengali interface, the above number can then be displayed in the most natural representation for Bengali, which might be ৪৬,৫৭,৩৮৮.
When entering a number, we will use the parsing function to turn the input of the user into the internal representation. It is then up to the community to decide how flexible they want to be: if they would only accept ৪৬,৫৭,৩৮৮ as the input, or whether ৪৬৫৭৩৮৮ would be just as good - or even also or only 4657388. The decision would be for the Wikifunctions community to make.
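To make the renderer and parser idea concrete, here is a minimal sketch in Python. It is not how Wikifunctions implements or will implement this: the function names render and parse, the language code "bn", and the lenient parsing policy are all assumptions made for illustration, and the type uses the prototype ZID from the Beta as in the example above.

```python
BENGALI_DIGITS = "০১২৩৪৫৬৭৮৯"
TYPE_ZID = "Z10015"  # prototype ZID from the Beta, as used in the text


def group_indian(digits: str) -> str:
    """Group the last three digits, then pairs: 4657388 -> 46,57,388."""
    if len(digits) <= 3:
        return digits
    head, tail = digits[:-3], digits[-3:]
    pairs = []
    while head:
        pairs.insert(0, head[-2:])
        head = head[:-2]
    return ",".join(pairs + [tail])


def render(zobject: dict, language: str) -> str:
    """Renderer: takes an object of the type and a language, returns a string."""
    digits = zobject[TYPE_ZID + "K1"]
    if language == "bn":
        table = str.maketrans("0123456789", BENGALI_DIGITS)
        return group_indian(digits).translate(table)
    return digits  # assumed fallback: show the internal form unchanged


def parse(text: str, language: str) -> dict:
    """Parser: takes a string and a language, returns an object of the type.

    This lenient version ignores the language and accepts grouped or
    ungrouped input in Bengali or Arabic numerals; a community could
    equally choose a stricter policy per language.
    """
    cleaned = text.strip().replace(",", "")
    if not cleaned.isdecimal():
        raise ValueError(f"not a whole number: {text!r}")
    # int() understands Unicode decimal digits; str() yields ASCII digits.
    return {"Z1K1": TYPE_ZID, TYPE_ZID + "K1": str(int(cleaned))}


number = {"Z1K1": "Z10015", "Z10015K1": "4657388"}
print(render(number, "bn"))
print(parse("৪৬,৫৭,৩৮৮", "bn") == number)
```

In this sketch all three inputs from the paragraph above (৪৬,৫৭,৩৮৮, ৪৬৫৭৩৮৮ and 4657388) parse to the same object; rejecting some of them would be a one-line change in parse.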
Note that we made a lot of assumptions in the above text: for example, using the ZID from the Beta, calling the type “positive integer”, and representing positive integers internally as Arabic numerals without formatting (instead of, say, hexadecimal, base 64, or a binary number, which could also be good solutions). All of these decisions are up to you; we made these assumptions only so that we could talk concretely about the proposal.
We plan to implement this proposal incrementally, over a few weeks and months. It will likely be the case that we will at first only accept the internal representation (just as it currently works on the Beta), and that we will then add renderers and finally parsers.
We are looking forward to hearing your feedback on this plan.
I think for long-term clarity it makes sense to have both *numeric literals* and *numeric primitives*, where the primitives are given a default *numeric type* (I assume 64-bit support only throughout Wikifunctions?). Further, it would be nice to eventually support *arbitrary-precision arithmetic* (GNU Multiple Precision Arithmetic Library https://gmplib.org/ and the GNU MPFR Library https://www.mpfr.org/) to allow computations with arbitrary-precision integers and floating-point numbers.
Julia https://docs.julialang.org/en/v1/manual/integers-and-floating-point-numbers has good support for representations, and has numerous "good practices or gotchas to think about" besides what Denny is bringing up here. Its fixed-width integer arithmetic is modular, so a result wraps around (overflow behavior https://docs.julialang.org/en/v1/manual/integers-and-floating-point-numbers/#Overflow-behavior) when it exceeds the maximum representable value of the given type.
For example, another question would be whether overflow behavior will be supported (or is wanted) in Wikifunctions:
julia> 10^19
-8446744073709551616

julia> big(10)^19
10000000000000000000
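The first Julia result is what two's-complement wraparound gives for a signed 64-bit integer. The following Python sketch reproduces both results; Python integers are arbitrary precision by default, so the wraparound has to be simulated, and wrap_int64 is a hypothetical helper written for this illustration.

```python
def wrap_int64(n: int) -> int:
    """Reduce n into the signed 64-bit range, mimicking two's-complement wraparound."""
    return (n + 2**63) % 2**64 - 2**63


wrapped = wrap_int64(10**19)  # same value as Julia's Int64 result for 10^19
exact = 10**19                # Python ints do not overflow, like big(10)^19

print(wrapped)
print(exact)
```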
Thad https://www.linkedin.com/in/thadguidry/ https://calendly.com/thadguidry/