That's good thinking, Deryck. I'm not convinced that such an approach should apply to all Z2/persistent objects but it seems appropriate for the more fundamental ones. I guess most of the earlier Z2s will be "more fundamental", so your approach allows something that works well at the beginning and can be changed later in the light of experience, as Denny suggested.

Another advantage is that it offers an implicit entry point into the Wiki of functions. If you happened to be looking at the Wikidata item for "MD5 hash" (https://www.wikidata.org/wiki/Q185235), you would see a link to the Wiki of functions' Z8/function (or other Z2, or something). If there is any ambiguity (as you envisage), it might be more appropriate to use Wikidata properties for disambiguation; effectively there is an implied Category of Z2 with more than one member, but I'm not sure whether the Wiki of functions will be using Categories.

There is also the possibility of linking to items in the Wikidata Lexeme namespace. At the moment, Lexemes can link back to Q items through their Senses. We seem to be lacking the concept of an abstract Sense or the sense of an abstract lexeme (which I consider essential for the language-neutral content of Abstract Wikipedia), but we can at least refer directly to the more precise Sense of a Lexeme, which can also have translations that might be appropriate. That encourages more precise definitions and improved translatability for the target Lexemes too, of course.

Best regards,
Al.

On Thursday, 5 November 2020, Deryck Chan <deryckchan@gmail.com> wrote:
I've just had another thought about labels and descriptions: if we're planning to have label + shortdesc for each language for each object, why don't we piggyback on Wikidata directly?

The object itself only needs to store one label and one shortdesc for bootstrapping purposes. Once the Z object is connected to a Wikidata Q item, all the labels and descriptions should be synced with the Q item.
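
As a rough sketch of the lookup order this would imply (Python; everything here is hypothetical: the `ZObject` class, its field names, the Z-ID, and the in-memory `WIKIDATA_LABELS` dict standing in for Wikidata, whose label values are illustrative rather than copied from the real item):

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical stand-in for Wikidata: Q-ID -> {language code -> label}.
# The label strings are illustrative, not copied from the real item.
WIKIDATA_LABELS = {
    "Q185235": {"en": "MD5 hash", "de": "MD5-Hash"},
}


@dataclass
class ZObject:
    zid: str
    bootstrap_label: str        # the single label stored on the object itself
    qid: Optional[str] = None   # set once the object is linked to a Q item


def label_for(obj: ZObject, lang: str) -> str:
    """Prefer the linked Q item's label; fall back to the bootstrap label."""
    if obj.qid is not None:
        label = WIKIDATA_LABELS.get(obj.qid, {}).get(lang)
        if label is not None:
            return label
    return obj.bootstrap_label


md5 = ZObject(zid="Z10001", bootstrap_label="md5")  # hypothetical Z-ID
unlinked = label_for(md5, "de")   # no Q item yet: bootstrap label is used
md5.qid = "Q185235"
linked = label_for(md5, "de")     # linked: the Q item's label wins
```

The bootstrap label stays in place as the fallback for languages in which the Q item has no label yet.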

The advantages of this implementation are
(1) No need for the wiki of functions to code the same functionality that Wikidata already provides; and
(2) For Z objects that implement well-known algorithms (e.g. MD5 hash) this encourages editors to find the Wikidata item about the same function and link them, rather than producing a parallel dictionary of language-specific labels and descriptions that serve the same purpose as the associated Wikidata item. If a Z object's implementation needs to be disambiguated from the most closely related Q item, we can always create a new Q item with the Z object as its only sitelink.

--Deryck

On Thu, 22 Oct 2020, 21:15 Denny Vrandečić, <dvrandecic@wikimedia.org> wrote:
A version of this newsletter with links and formatting is available on-wiki here: https://meta.wikimedia.org/wiki/Abstract_Wikipedia/Updates/2020-10-22

In this edition of the weekly posts, I want to discuss the different places where we will use unstructured text in the objects of the wiki of functions. The following is a plan and a request for comments. Besides the labels, nothing is implemented yet, so we would really appreciate your feedback.

Labels. Every object in the wiki of functions will be identified by a Z-ID, similar to Q-IDs identifying items in Wikidata. But just like Q-IDs, we don't expect Z-IDs to be widely visible and used. Instead, every object will have labels, one per language.

But unlike Wikidata items, every object will be an instance of a specific type. For example, there might be an object representing the addition of two integer numbers. So the English label for this object could be “add”. Other good labels for this object could be “addition”, “sum”, or “plus”. An inappropriate name for the function would be “multiplication”, as that would be terribly confusing.

Uniqueness. There would likely also be other objects representing functions that do addition, for example the addition of two floating point numbers, or of two complex numbers, or of two matrices. In the wiki of functions labels will not need to be unique overall - but they will need to be unique for each type. So there can only be one function with the label “add” that takes two integers and returns one integer. Or only one type with the label “integer”. Per type, each label must be unique.
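
The per-type uniqueness rule could be sketched roughly as follows (Python; the `LabelRegistry` class, the Z-IDs, and the type-signature strings are all assumptions made up for illustration, and scoping the check to a single language is also an assumption):

```python
class LabelRegistry:
    """Enforces that a label is unique per type (and language), not globally."""

    def __init__(self):
        # (type signature, language, label) -> Z-ID that owns the label
        self._taken = {}

    def set_label(self, zid, type_sig, lang, label):
        key = (type_sig, lang, label)
        owner = self._taken.get(key)
        if owner is not None and owner != zid:
            raise ValueError(
                f"{label!r} is already used by {owner} for type {type_sig}"
            )
        self._taken[key] = zid


registry = LabelRegistry()
registry.set_label("Z20001", "(integer, integer) -> integer", "en", "add")
# Same label, different type: allowed.
registry.set_label("Z20002", "(float, float) -> float", "en", "add")

try:
    # Same label, same type, different object: rejected.
    registry.set_label("Z20003", "(integer, integer) -> integer", "en", "add")
    clash = False
except ValueError:
    clash = True
```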

Now does every object need a label? No. There will be many objects where a label won’t be strictly necessary. Not every test for a function will need a label, nor will every implementation of a function. They may have labels, but they won’t be necessary - this is another difference to Wikidata, where items without labels are almost always problematic.

One more note on labels - labels do not have to be direct translations across languages. So in one language, two functions might have the same label if they have a different type, but in another language the two functions might have different labels as well. For example, in English “length” might be an appropriate label for a function that returns the number of elements in a list, but also for a function that returns the length of a river, or a function that returns the duration of a movie. In Croatian, on the other hand, all three of these might have different names (“broj elemenata”, “duljina”, “trajanje”). Each language can decide what pattern works best for it. Whether verbs in the imperative mood (“add”) or a description of the result (“sum”) or the name of the operation (“addition”) work best can be decided from language to language, and independent style guides for each language may evolve.

Aliases. Besides labels, every object can have additional aliases per language. Aliases are helpful when searching for an object. The above function, labeled “add” in English, could have all the other alternative names given above - “sum”, “plus”, “addition” - as aliases, so that when someone searches for one function by a different name they still find the right one as a result.
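
A minimal sketch of how aliases could feed search, assuming a simple in-memory index (Python; the `build_search_index` helper, the Z-IDs, and the second example object with its aliases are hypothetical and not part of the plan above):

```python
from collections import defaultdict


def build_search_index(objects):
    """Map every lowercased label and alias to the Z-IDs that carry it."""
    index = defaultdict(set)
    for zid, names in objects.items():
        for name in [names["label"], *names["aliases"]]:
            index[name.lower()].add(zid)
    return index


# English names only; Z-IDs and the "multiply" entry are made up.
objects_en = {
    "Z20001": {"label": "add", "aliases": ["sum", "plus", "addition"]},
    "Z20004": {"label": "multiply", "aliases": ["times", "product"]},
}

index = build_search_index(objects_en)
hits = index.get("plus", set())  # searching by an alias finds the "add" object
```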

Documentation. Every object may also have documentation in each language. Documentation is some wikitext that further describes the given object. Many objects will not have any documentation at all, but many will have some. If you ever had the opportunity to read The Art of Computer Programming, you will know that there is often a lot to say about a function! But besides this kind of background story, we can also have some documentation describing a given implementation, or some explanation for why a given test is useful. It could also link to a Phabricator task describing an error that used to be there and that this test checks for, in order to catch it before it resurfaces, or to other resources such as a textbook on algorithms.

Keys and arguments. At least types and functions will also have labels for each key of the type and for each argument of the function. These will be used in creating the user interface to display and edit values and function calls.

Short descriptions. One specific question we have is - should we have optional short descriptions for each object? In many IDEs, and sometimes even directly in the programming language (think docstrings in Python) there is support for short one-liners that give a bit of extra information for a function, beyond just the name of the function and the arguments and the types.
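
For readers less familiar with the Python convention mentioned here, a small illustration: the first line of a docstring acts as the one-line summary, and tools can pull it out on its own. The `short_description` helper is a hypothetical name for this sketch; `inspect.getdoc` is the standard-library call.

```python
import inspect


def add(a: int, b: int) -> int:
    """Return the sum of two integers.

    Longer documentation can follow after a blank line.
    """
    return a + b


def short_description(func) -> str:
    """Return the first line of a function's docstring, or '' if it has none."""
    doc = inspect.getdoc(func) or ""
    return doc.splitlines()[0] if doc else ""


summary = short_description(add)
```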

We are going to have a strong type system, and we will have plenty of space for the documentation. So do we really also need a place for short descriptions? What would their use case be? How would they be used in the UX? On the website, something akin to the pop-up previews in Wikipedia, previewing the whole documentation, seems to be even more useful than a one-liner.

Originally I assumed by default that we would have short descriptions, given their importance and usefulness in Wikidata, and so they also feature in the AbstractText prototype. But they were never useful there. Also, the type took over the role of the disambiguator, as described above, so there was really no technical need for a short description. I currently lean towards not having them, but I would like to hear more input.

The good thing is that no decision will be carved in stone. The data model of the wiki of functions will be much more flexible than the data model of Wikidata, and if we figure out that we do need short descriptions, we can just introduce them later. It is much harder to remove things though, because almost everything that is there gets used some way or the other, so I am more wary of introducing features without a good reason.

Dogfooding. One obvious question is: hey, we are developing this architecture to create multilingual content, why even have all this documentation and labels and all that in actual languages, why not use our own functions to build all of this up?

And yes, agreed, that would be best. It’s just, we’re not there yet, and until then, we will still need labels and documentation and aliases. So eventually I would very much look forward to using abstract content to describe the objects of the wiki of functions themselves. It will also be interesting to see if we can then roll back some of the local content, and how open we will be to doing so. This will give us a lot of interesting insight into how to approach similar goals in the other Wikimedia projects: from descriptions (and maybe even labels? or sense glosses?) to stubby Wikipedia articles, there will be plenty of potential to make our content across projects easier to maintain and to provide a more uniform baseline coverage across languages.

New video introducing Abstract Wikipedia. A new presentation, given for Wikidata’s Eighth Birthday, organized by the community of WikiProject:India, is available: https://www.youtube.com/watch?v=GAb1HylGemA

Naming contest. Next week, Tuesday 27 October, the second round of voting for the name of the new wiki of functions will begin. The proposals are currently being vetted by legal, and we should know which names will be in the final round soon.

_______________________________________________
Abstract-Wikipedia mailing list
Abstract-Wikipedia@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/abstract-wikipedia