I've just had another thought about labels and descriptions: if we're
planning to have label + shortdesc for each language for each object, why
don't we piggyback on Wikidata directly?
The object itself only needs to store one label and one shortdesc for
bootstrapping purposes. Once the Z object is connected to a Wikidata Q
item, all the labels and descriptions should be synced with the Q item.
The advantages of this implementation are:
(1) No need for the wiki of functions to code the same functionality that
Wikidata already provides; and
(2) For Z objects that implement well-known algorithms (e.g. MD5 hash)
this encourages editors to find the Wikidata item about the same function
and link them, rather than producing a parallel dictionary of
language-specific labels and descriptions that serve the same purpose as
the associated Wikidata item. If a Z object's implementation needs to be
disambiguated from the most closely related Q item, we can always create a
new Q item with the Z object as its only sitelink.
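To make the proposal concrete, here is a rough sketch in Python of the lookup it would imply. Everything in it is hypothetical: the ZObject fields, the placeholder IDs, the shape of the synced label data, and the fallback order are assumptions for illustration, not an existing API of Wikidata or the wiki of functions.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class ZObject:
    """Hypothetical Z object: one bootstrap label/shortdesc plus an optional Q item link."""
    zid: str
    bootstrap_label: str
    bootstrap_shortdesc: str
    linked_qid: Optional[str] = None


def resolve_label(obj: ZObject, language: str,
                  wikidata_labels: Dict[str, Dict[str, str]]) -> str:
    """Prefer the linked Q item's label in `language`, else the bootstrap label.

    `wikidata_labels` stands in for labels synced from Wikidata, keyed by
    Q-ID and then by language code (an assumed shape).
    """
    if obj.linked_qid is not None:
        synced = wikidata_labels.get(obj.linked_qid, {})
        if language in synced:
            return synced[language]
    return obj.bootstrap_label


# Placeholder IDs, not real Z-IDs or Q-IDs.
md5 = ZObject("Z_MD5", "MD5 hash", "cryptographic hash function", linked_qid="Q_MD5")
synced = {"Q_MD5": {"en": "MD5", "de": "Message-Digest Algorithm 5"}}
print(resolve_label(md5, "de", synced))
print(resolve_label(md5, "hr", synced))
```

An unlinked object, or a language the Q item has no label for, falls back to the single bootstrap label stored on the Z object.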
--Deryck
On Thu, 22 Oct 2020, 21:15 Denny Vrandečić, <dvrandecic(a)wikimedia.org>
wrote:
A version of this newsletter with links and formatting is available
on-wiki here:
https://meta.wikimedia.org/wiki/Abstract_Wikipedia/Updates/2020-10-22
In this edition of the weekly posts, I want to discuss the different
places where we will use unstructured text in the objects of the wiki of
functions. The following is a plan and a request for comments. Besides the
labels, nothing is implemented yet, so we would really appreciate your
feedback.
Labels. Every object in the wiki of functions will be identified by a
Z-ID, similar to Q-IDs identifying items in Wikidata. But just like Q-IDs,
we don't expect Z-IDs to be widely visible and used. Instead, every object
will have labels, one per language.
But unlike Wikidata items, every object will be an instance of a specific
type. For example, there might be an object representing the addition of
two integer numbers. So the English label for this object could be “add”.
Other good labels for this object could be “addition”, “sum”, or “plus”. An
inappropriate name for the function would be “multiplication”, as that
would be terribly confusing.
Uniqueness. There would likely also be other objects representing
functions that do addition, for example the addition of two floating point
numbers, or of two complex numbers, or of two matrices. In the wiki of
functions, labels will not need to be unique overall - but they will need to
be unique for each type. So there can only be one function with the label
“add” that takes two integers and returns one integer. Or only one type
with the label “integer”. Per type, each label must be unique.
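The per-type uniqueness rule could be sketched as follows. This is only an illustration under assumptions: the LabelRegistry class, the plain-string type signatures, and the choice to scope uniqueness per language are hypothetical and not part of any implemented design.

```python
from typing import Dict, Tuple


class LabelRegistry:
    """Hypothetical registry allowing each label once per (language, type)."""

    def __init__(self) -> None:
        # (language, type signature, label) -> Z-ID that owns the label
        self._taken: Dict[Tuple[str, str, str], str] = {}

    def assign(self, zid: str, language: str, type_signature: str, label: str) -> None:
        """Record a label, rejecting it if another object of the same type has it."""
        key = (language, type_signature, label)
        owner = self._taken.get(key)
        if owner is not None and owner != zid:
            raise ValueError(
                f"label {label!r} is already used by {owner} for type {type_signature}"
            )
        self._taken[key] = zid


registry = LabelRegistry()
# Same label, different types: both are accepted.
registry.assign("Z1", "en", "(integer, integer) -> integer", "add")
registry.assign("Z2", "en", "(float, float) -> float", "add")
```

A third object with the label “add” and the signature of Z1 would be rejected with a ValueError, while “add” on a function over matrices would be accepted.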
Now does every object need a label? No. There will be many objects where
a label won’t be strictly necessary. Not every test for a function will
need a label, nor will every implementation of a function. They may have
labels, but they won’t be necessary - this is another difference to
Wikidata, where items without labels are almost always problematic.
One more note on labels - labels do not have to be direct translations
across languages. So in one language, two functions might have the same
label if they have a different type, but in another language the two
functions might have different labels as well. For example, in English
“length” might be an appropriate label for a function that returns the
number of elements in a list, for a function that returns the length of a
river, and for a function that returns the duration of a movie. In
Croatian, on the other hand, all three of these might have different names
(“broj elemenata”, “duljina”, “trajanje”). Each language can decide what
pattern works best for them. Whether verbs in the imperative mood (“add”)
or a description of the result (“sum”) or the name of the operation
(“addition”) work best can be decided from language to language, and
independent style guides for each language may evolve.
Aliases. Besides labels, every object can have additional aliases per
language. Aliases are helpful when searching for an object. The above
function, labeled “add” in English, could have all the other alternative
names given above - “sum”, “plus”, “addition” - as aliases, so that when
someone searches for one function by a different name they still find the
right one as a result.
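A minimal sketch of how aliases could feed search, assuming a simple exact-match, case-insensitive index. The NameIndex class and its methods are hypothetical names chosen for illustration; a real search would likely be more forgiving.

```python
from collections import defaultdict
from typing import DefaultDict, Iterable, Set, Tuple


class NameIndex:
    """Hypothetical index mapping labels and aliases to Z-IDs, per language."""

    def __init__(self) -> None:
        self._index: DefaultDict[Tuple[str, str], Set[str]] = defaultdict(set)

    def add(self, zid: str, language: str, label: str,
            aliases: Iterable[str] = ()) -> None:
        # The label and every alias all point at the same object.
        for name in (label, *aliases):
            self._index[(language, name.casefold())].add(zid)

    def search(self, language: str, query: str) -> Set[str]:
        # .get avoids creating empty entries for unknown queries.
        return set(self._index.get((language, query.casefold()), set()))


index = NameIndex()
index.add("Z1", "en", "add", aliases=["sum", "plus", "addition"])
print(index.search("en", "Sum"))
```

Searching for “sum”, “plus”, or “addition” finds the same object as searching for its label “add”.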
Documentation. Every object may also have documentation in each language.
Documentation is some wikitext that further describes the given object.
Many objects will not have any documentation at all, but many will have
some. If you ever had the opportunity to read The Art of Computer
Programming, you will know that there is often a lot to say about a
function! But besides this kind of background story, we can also have some
documentation describing a given implementation, or some explanation for
why a given test is useful. It could also link to a Phabricator task
describing an error that used to be there, which this test checks for in
order to catch it before it resurfaces, or to other resources such as a
textbook on algorithms.
Keys and arguments. At least types and functions will also have labels
for each key of the type and for each argument of the function. These will
be used in creating the user interface to display and edit values and
function calls.
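As a rough illustration of how argument labels might drive the display of a function call, consider the sketch below. The Argument class, the key format, and the fallback to the raw key when a language has no label are all assumptions, not a description of the planned data model.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Argument:
    """Hypothetical function argument with one label per language."""
    key: str                      # language-independent identifier (assumed format)
    type_name: str
    labels: Dict[str, str] = field(default_factory=dict)

    def label_for(self, language: str) -> str:
        # Assumed behaviour: show the raw key when no label exists.
        return self.labels.get(language, self.key)


def render_call(function_label: str, arguments: List[Argument], language: str) -> str:
    """Build a display string for a function call in the given language."""
    names = ", ".join(f"{a.label_for(language)}: {a.type_name}" for a in arguments)
    return f"{function_label}({names})"


args = [
    Argument("K1", "integer", {"en": "left", "hr": "lijevo"}),
    Argument("K2", "integer", {"en": "right"}),
]
print(render_call("add", args, "en"))
```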
Short descriptions. One specific question we have is - should we have
optional short descriptions for each object? In many IDEs, and sometimes
even directly in the programming language (think docstrings in Python)
there is support for short one-liners that give a bit of extra information
for a function, beyond just the name of the function and the arguments and
the types.
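For reference, this is what such a one-liner looks like as a Python docstring, together with a small helper that pulls it out the way a tool might for a tooltip. The helper short_description is a hypothetical name written for this example, not a standard library function.

```python
def add(a: int, b: int) -> int:
    """Return the sum of two integers."""
    return a + b


def short_description(func) -> str:
    """Return the first line of a function's docstring, or an empty string."""
    doc = (func.__doc__ or "").strip()
    return doc.splitlines()[0] if doc else ""


print(short_description(add))
```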
We are going to have a strong type system, and we will have plenty of
space for the documentation. So do we really also need a place for short
descriptions? What would their use case be? How would they be used in the
UX? On the website, something akin to the pop-up previews in Wikipedia,
previewing the whole documentation, seems even more useful than a one-liner.
Originally I assumed by default that we would have short descriptions,
given their importance and usefulness in Wikidata, and so they also feature
in the AbstractText prototype. But they were never useful there. Also, the
type took over the role of the disambiguator, as described above, so there
was really no technical need for a short description. I currently lean
towards not having them, but I would like to hear more input.
The good thing is that no decision will be carved in stone. The data
model of the wiki of functions will be much more flexible than the data
model of Wikidata, and if we figure out that we do need short descriptions,
we can just introduce them later. It is much harder to remove things
though, because almost everything that is there gets used one way or
another, so I am more wary of introducing features without a good reason.
Dogfooding. One obvious question is: hey, we are developing this
architecture to create multilingual content, why even have all this
documentation and labels and all that in actual languages, why not use our
own functions to build all of this up?
And yes, agreed, that would be best. It’s just, we’re not there yet, and
until then, we will still need labels and documentation and aliases. So
eventually I would very much look forward to using abstract content to
describe the objects of the wiki of functions themselves. It will also be
interesting to see if we can then roll back some of the local content, and
how open we will be to do so. This will give us a lot of interesting
insight into how to approach similar goals in the other Wikimedia projects:
from descriptions (and maybe even labels? or sense glosses?) to stubby
Wikipedia articles, there will be plenty of potential to make our content
across projects easier to maintain and to provide a more uniform
baseline coverage across languages.
New video introducing Abstract Wikipedia. A new presentation, given for
Wikidata’s Eighth Birthday, organized by the community of
WikiProject:India, is available:
https://www.youtube.com/watch?v=GAb1HylGemA
Naming contest. Next week, Tuesday 27 October, the second round of voting
for the name of the new wiki of functions will begin. The proposals are
currently being vetted by legal, and we should know which names will be in
the final round soon.
_______________________________________________
Abstract-Wikipedia mailing list
Abstract-Wikipedia(a)lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/abstract-wikipedia