- Abstract-Wikipedia - lists.wikimedia.org

URL-addressable statements and clusters of statements
by Adam Sobieski 19 Feb '21

Hello.

The following ideas about URL-addressable statements and clusters of statements (e.g. paraphrase sets or clusters) are relevant to a recent Wikifact project proposal <https://meta.wikimedia.org/wiki/Wikifact>, could be relevant to a recent Wikipragmatica project proposal <https://meta.wikimedia.org/wiki/Wikipragmatica>, and, hopefully, are relevant and interesting to Wikidata and Abstract Wikipedia.

Each statement, claim, or fact could have a URL, for instance https://www.wikifact.org/statements/33DCF305-3A4D-4024-9AD7-CCB1A29054E2 . Each cluster of paraphrases could also have a URL, for instance https://www.wikifact.org/clusters/D006871E-24A6-428F-BD1F-D20C3C7B7685 .

The URL for an individual statement, claim, or fact could, while optionally providing data, redirect to the URL for the paraphrase cluster which contains it. This could simplify semi-automated, collaborative paraphrasing: in the event of an erroneous paraphrasing, editors or software tools could edit a redirect page to move the individual statement, claim, or fact to a different cluster of paraphrases. At the URL for a paraphrase cluster, there could be a human-editable sequence of explained annotations about a statement, claim, or fact.

URL-addressability would also make Web-based communication about statements, claims, and facts more convenient. End-users would be able to share hyperlinks to fact-checking articles about individual statements, claims, or facts. This could facilitate a number of other, related technologies.

Also interestingly, statement patterns could be expressed and utilized via URL query strings. Nouns or noun phrases could be provided as arguments; that is, arguments for thematic relations could be provided utilizing Wikidata lexemes and entities.
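To make the addressing scheme more concrete, here is a minimal Python sketch of the ideas above: statement URLs that redirect to cluster URLs, re-clustering by editing the redirect, and a pattern URL whose query string fills thematic roles. Everything beyond the host and the three example IDs is an assumption of this sketch: the in-memory redirect table, the function names, and in particular the query parameter names `agent` and `patient`, which are illustrative only and are not taken from any existing Wikifact API.

```python
from urllib.parse import urlencode

BASE = "https://www.wikifact.org"  # host taken from the example URLs in the message

# Hypothetical redirect table: statement ID -> ID of the paraphrase cluster containing it.
redirects = {
    "33DCF305-3A4D-4024-9AD7-CCB1A29054E2": "D006871E-24A6-428F-BD1F-D20C3C7B7685",
}


def statement_url(statement_id: str) -> str:
    return f"{BASE}/statements/{statement_id}"


def cluster_url(cluster_id: str) -> str:
    return f"{BASE}/clusters/{cluster_id}"


def resolve(statement_id: str) -> str:
    """Return the cluster URL that the statement URL would redirect to."""
    return cluster_url(redirects[statement_id])


def recluster(statement_id: str, new_cluster_id: str) -> None:
    """Correct an erroneous paraphrasing by pointing the redirect at another cluster."""
    redirects[statement_id] = new_cluster_id


def pattern_url(pattern_id: str, **roles: str) -> str:
    """Build a pattern URL whose query string fills thematic roles.

    The role names are whatever the caller passes; they are assumptions, not a known API.
    """
    return f"{BASE}/patterns/{pattern_id}?{urlencode(roles)}"


# Q42 is the Wikidata item for Douglas Adams, Q89 the item for apple.
example = pattern_url("293FCD5D-27A7-498A-81C3-C78EF0F9D9A2", agent="Q42", patient="Q89")
print(resolve("33DCF305-3A4D-4024-9AD7-CCB1A29054E2"))
print(example)
```

Keeping the statement-to-cluster mapping as a redirect means a re-clustering is a single edit, and links to the statement URL that were shared earlier keep working.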
For example, https://www.wikifact.org/patterns/293FCD5D-27A7-498A-81C3-C78EF0F9D9A2?agen… could represent a set of statements expressing that “Douglas Adams ate an apple.”

Best regards,
Adam Sobieski

Newsletter #19: A test suite for evaluation engines (and other updates)
by Denny Vrandečić 18 Feb '21

The on-wiki version of this newsletter can be found here: https://meta.wikimedia.org/wiki/Abstract_Wikipedia/Updates/2021-02-18

Development has been active. We are deep into Phase γ <https://meta.wikimedia.org/wiki/Abstract_Wikipedia/Phases#Phase_%CE%B3_(gam…>, working on supporting the core types for Wikifunctions, including functions, implementations, testers, errors, and so on. We are removing some major blockers for further development. At the same time, we have already begun our work on the larger architecture of the system, in particular our evaluation engine with support for one native programming language.

The evaluation engine is the part of Wikifunctions responsible for evaluating function calls. That is, it is the part that gets asked, “Hey, what’s the sum of 3 and 5?” and answers, “8”. Our evaluation engine is separated into two main parts: the function orchestrator <https://gerrit.wikimedia.org/r/plugins/gitiles/mediawiki/services/function-…>, which receives the calls and collates the functions and any data needed to process and evaluate the calls; and the function executor <https://gerrit.wikimedia.org/r/plugins/gitiles/mediawiki/services/function-…>, which runs the contributor-written code, as instructed by the orchestrator. As the executor can run uncontrolled native code, it lives in a tightly controlled environment and has only minimal permissions beyond the limited use of compute and memory.

The orchestrator will also rely heavily on caching: if we have just calculated the sum of 3 and 5, and someone else asks for that too, we’ll just take it from the cache instead of re-running the computation. We'll also cache function definitions and inputs within the orchestrator, so that if someone asks for the sum of 3 and 6 we can answer more swiftly.

But this is just our production evaluation engine.
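As a rough illustration of the split between orchestrator and executor, and of the two kinds of caching mentioned above, consider the following Python sketch. It is not the code of the actual services: the class names, the in-memory dictionaries, and the `function_store` stand-in for looking up definitions on the wiki are all assumptions made for this example, and no sandboxing is modelled.

```python
class Executor:
    """Stand-in for the function executor: it only runs what it is handed."""

    def __init__(self):
        self.runs = 0  # counts actual computations, to make cache hits visible

    def run(self, implementation, args):
        self.runs += 1
        return implementation(*args)


class Orchestrator:
    """Stand-in for the function orchestrator: collects definitions, caches, delegates."""

    def __init__(self, executor, function_store):
        self.executor = executor
        self.function_store = function_store  # hypothetical lookup of definitions by name
        self.definition_cache = {}
        self.result_cache = {}
        self.store_lookups = 0

    def _definition(self, name):
        # Cache function definitions so repeated calls skip the lookup.
        if name not in self.definition_cache:
            self.store_lookups += 1
            self.definition_cache[name] = self.function_store[name]
        return self.definition_cache[name]

    def call(self, name, *args):
        key = (name, args)
        if key in self.result_cache:  # same call as before: answer from the cache
            return self.result_cache[key]
        result = self.executor.run(self._definition(name), args)
        self.result_cache[key] = result
        return result


store = {"add": lambda a, b: a + b}
executor = Executor()
orchestrator = Orchestrator(executor, store)

first = orchestrator.call("add", 3, 5)   # computed by the executor
second = orchestrator.call("add", 3, 5)  # served from the result cache
third = orchestrator.call("add", 3, 6)   # new computation, cached definition
print(first, second, third, executor.runs, orchestrator.store_lookups)
```

In this sketch the second request for the sum of 3 and 5 never reaches the executor, and the request for the sum of 3 and 6 reuses the cached definition of the function.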
We are hoping that several other evaluation engines will be built, like the GraalVM-based one <https://github.com/lucaswerkmeister/graaleneyj> on which Lucas Werkmeister is already working. In order to support the development of evaluation engines, we are working on a test suite that other evaluation engines can use for conformance testing. If you’re interested in joining that effort, drop a note on this task <https://phabricator.wikimedia.org/T275093>. The test suite, as well as the common code used by several parts of our system to handle ZObjects, will live in a new library repository, function-schemata <https://gerrit.wikimedia.org/r/plugins/gitiles/mediawiki/services/function-…>.

This development has been a bit out of order from the original plan we conceived last August. In fact, we are thinking of changing the order of some of the developments, and we expect to do significant parts of it in parallel. Having the evaluation engine available earlier makes it possible to start the security and performance reviews in a timely manner, and to validate our architectural plans.

Originally, we had planned for an evaluation engine that understands a programming language only in Phase θ, and to support only a single programming language until after launch. We have now moved that to be much sooner, and we also plan to support at least two programming languages right at launch. This change will help us avoid the pitfall of getting stuck with a design that only works for one programming language. Having two or more will better commit us to a project that is multi-environment in terms of programming languages.

In other news

The deadline for submissions to the Wikifunctions logo concept <https://meta.wikimedia.org/wiki/Abstract_Wikipedia/Wikifunctions_logo_conce…> is coming closer: submissions are accepted until Tuesday, 23 February, followed by a two-day discussion before the voting on which concept to develop starts on Thursday, 25 February.
Currently, we have 17 submissions (and some additional variants).

There have been a number of talks and external articles which may be of interest.

We gave a presentation at the Graph Technologies in the Humanities: 2021 Virtual Symposium <https://graphentechnologien.hypotheses.org/tagungen/graph-technologies-in-t…>. You can watch our pre-recorded presentation <https://thm.cloud.panopto.eu/Panopto/Pages/Viewer.aspx?id=30dcef00-63a2-44b…> for the symposium. It was followed by ample time to discuss the project; unfortunately, the discussion itself will not be published.

We also presented at the NSF Convergence Accelerator Series <http://spatial.ucsb.edu/2021/Denny-Vrandecic>. The talk is very similar to the previous talk, but this recording includes the discussion following the talk.

The Tool Box Journal - A Computer Journal For Translation Professionals Issue 322 <https://internationalwriters.com/toolkit/current.html> reports on Abstract Wikipedia, Wikifunctions, and Wikidata. I found it very interesting to see how the projects are perceived by professional translators, and their comparison of Wikidata to a termbase.

The German magazine Der Spiegel published an interview with Denny <https://www.spiegel.de/netzwelt/web/wikipedia-wird-20-wenn-google-das-proje…> about Abstract Wikipedia. They also published a more comprehensive article in their 16 January print issue, which is available in their archive for subscribers <https://www.spiegel.de/netzwelt/web/wie-wikipedia-zu-einer-uebersetzungsmas…>. Both the interview and the article are in German.

Newsletter #18: Two prototype tools to visualize lexicographic coverage in Wikidata
by Denny Vrandečić 10 Feb '21

The on-wiki version of this newsletter is here: https://meta.wikimedia.org/wiki/Abstract_Wikipedia/Updates/2021-02-10

The goal of Abstract Wikipedia is to generate natural language texts from an abstract representation of the content to be represented. In order to do so, we will use lexicographic data from Wikidata. And although we are quite far from being able to generate texts, one thing that we want to encourage everyone’s help with is the coverage and completeness of the lexicographic data in Wikidata.

Today we want to present prototypes of two tools that could help people to visualize, exemplify, and better guide our understanding of the coverage of lexicographic data in Wikidata.

Annotation interface

The first prototype is an annotation interface that allows users to annotate sentences in any language, associating each word or expression with a Lexeme from Wikidata, including picking its Form and Sense.

You can see an example in the screenshot below. Each ‘word’ of the sentence here is annotated with a Lexeme (the Lexeme ID L31818 <https://www.wikidata.org/wiki/Lexeme:L31818> is given just under the word), followed by the lemma, the language, and the part of speech. Then comes, if selected, the specific Form that is being used in context - for example, on ‘dignity’ we see the Form ID L31818#F1, which is the singular Form of the Lexeme. Lastly comes the Sense, which is assigned Sense ID L31818#S1 and defined by a gloss.

At any time, you can remove any of the annotations, or add new annotations. Some of the options will take you directly to Wikidata. For example, if you want to add a Sense to a given Lexeme, because it has no Senses or is missing the one you need, it will take you to Wikidata and let you do that there in the normal fashion. Once added there, you can come back and select the newly added Sense.

The user interface of the prototype is a bit slow, so please give it a few seconds when you initiate an action.
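To illustrate what a single annotation carries, here is a small Python sketch of a possible record for one annotated word, using the ID notation from the example above (L31818, L31818#F1, L31818#S1). The class, its field names, and the validation rule are assumptions made for illustration; they do not describe how the prototype stores its data.

```python
import re
from dataclasses import dataclass
from typing import Optional


@dataclass
class Annotation:
    """Hypothetical record: one word of a sentence linked to a Lexeme, Form and Sense."""

    token: str
    lexeme_id: str
    form_id: Optional[str] = None   # optional: the Form used in context
    sense_id: Optional[str] = None  # optional: the Sense used in context

    def __post_init__(self):
        if not re.fullmatch(r"L\d+", self.lexeme_id):
            raise ValueError(f"not a Lexeme ID: {self.lexeme_id}")
        # A Form or Sense must belong to the annotated Lexeme.
        for sub_id, kind in ((self.form_id, "F"), (self.sense_id, "S")):
            pattern = rf"{re.escape(self.lexeme_id)}#{kind}\d+"
            if sub_id is not None and not re.fullmatch(pattern, sub_id):
                raise ValueError(f"{sub_id} does not belong to {self.lexeme_id}")

    def is_complete(self) -> bool:
        """True when both a Form and a Sense have been selected."""
        return self.form_id is not None and self.sense_id is not None


dignity = Annotation("dignity", "L31818", form_id="L31818#F1", sense_id="L31818#S1")
partial = Annotation("dignity", "L31818")
print(dignity.is_complete(), partial.is_complete())
```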
It should work out of the box in different languages. The Universal Language Selector is available (at the top of the page), which you can use to change the language. Note that glosses of Senses are frequently only available in the language of the Lexeme, and the UI doesn’t yet do language fallback, so if you look at English sentences with a German UI you might often find missing glosses.

Technologically, this is a prototype entirely implemented in JavaScript and CSS on top of a vanilla MediaWiki installation. This is likely not the best possible technical solution for such a system, but it should help to determine whether there is enough user interest in the tool to justify a potential reimplementation. Also, it would be a fascinating task to agree on an API which can be implemented by other groups to provide the selection of Lexemes, Senses, and Forms for input sentences. The current baseline here is extremely simple, and would not be good enough for an automated tagging system. Having this available for many sentences in many languages could provide a great corpus for training natural language understanding systems. There is a lot that could be built upon that.

The goal of this prototype is to make more tangible the Wikidata community's progress regarding the coverage of the lexicographical data. You can take a sentence in any written language, put it into this system, and find out how complete you can get with your annotations. It's a way to showcase and create anecdotal experience of the lexicographic data in Wikidata.

The prototype annotation interface is at: http://annotation.wmcloud.org/
You can discuss it here: https://annotation.wmcloud.org/wiki/Discussion (You will need to create a new account. If you have time to set this up with SUL, drop me a line.)

Corpus coverage dashboard

The second prototype tool is a dashboard that shows the coverage of the data compared to a corpus in each of forty languages.
Last year, whilst in my previous position at Google Research, I co-authored a publication where we built and published language models out of the cleaned-up text of about forty Wikipedia language editions [1]. Besides the language models, we also published the raw data: this text has been cleaned up by the pre-processing system that Google uses on Wikipedia text in order to integrate the text in several of its features. So this dataset consists of relatively clean natural language text, certainly compared to the raw wiki text, but it still contains plenty of artefacts. If you know of better large-scale encyclopedic text corpora we can use, maybe better cleaned-up versions of Wikipedia, or ones covering more languages, please let us know <https://phabricator.wikimedia.org/T273221>.

We extracted these texts from the TensorFlow models <https://www.tensorflow.org/datasets/catalog/wiki40b>. We provide the extracted texts for download <https://drive.google.com/drive/folders/1HfL138UCqr69w0XfAhlAEUh6VVOnzwBE> (a task <https://phabricator.wikimedia.org/T274208> to move it to Wikimedia servers is underway). We split the text into tokens, counted the occurrences of words, and compared how many of these tokens appear in the Forms on Lexemes of the given language in Wikidata’s lexicographic data. If this proves useful, we could move the cleaned-up text to a more permanent home.

A screenshot of the current state for English is given here: we see how many Forms for this language are available in Wikidata, and we see how many different Forms are attested in Wikipedia (i.e., how many different words, or word types, are in the Wikipedia of the given language). The number of tokens is the total number of words in the given language corpus.
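The counting behind these numbers can be sketched in a few lines of Python. This is a simplified illustration and not the script used for the dashboard: the regular-expression tokenizer, the lowercasing, and the function name are assumptions, and `known_forms` stands in for the set of Forms on Lexemes of one language in Wikidata.

```python
import re
from collections import Counter


def coverage(corpus_text, known_forms, top_missing=3):
    """Compare the word forms of a corpus against a set of known Forms."""
    tokens = re.findall(r"\w+", corpus_text.lower())  # naive tokenizer (assumption)
    counts = Counter(tokens)                           # form -> number of occurrences
    covered = [form for form in counts if form in known_forms]
    missing = [form for form, _ in counts.most_common() if form not in known_forms]
    return {
        "tokens": len(tokens),                                   # all word occurrences
        "forms": len(counts),                                    # distinct words (types)
        "covered_forms": len(covered),                           # types found in the Form set
        "covered_tokens": sum(counts[form] for form in covered), # occurrences those types cover
        "most_frequent_missing": missing[:top_missing],
    }


stats = coverage("Time flies. Time is short; time heals.", {"time", "is"})
print(stats)
```

In this toy corpus, ‘time’ occurs three times, so it contributes one covered form but three covered tokens.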
Covered forms says how many of the forms in the corpus are also in Wikidata's Lexeme set, and covered tokens tells us how many of the occurrences that covers (so, if the word ‘time’ appears 100 times in English Wikipedia, it would be counted as one covered form, but 100 covered tokens). The two pie charts visualize the coverage of forms and tokens respectively.

Finally, there is a link to the thousand most frequent forms that are not yet in Wikidata. This can help communities prioritise ramping up coverage quickly. Note, though, that the progress report is manual and does not automatically update. I plan to run an update from time to time for now.

The prototype corpus coverage dashboard is at: https://www.wikidata.org/wiki/Wikidata:Lexicographical_coverage
You can discuss it here: https://www.wikidata.org/wiki/Wikidata_talk:Lexicographical_coverage

Help wanted

Both prototype tools are exactly that: prototypes, not real products. We have not committed to supporting and developing these prototypes further. At the same time, all of the code and data is of course open sourced. If anyone would like to pick up the development or maintenance of these prototypes, you would be more than welcome; please let us know (on my talk page <https://meta.wikimedia.org/wiki/User_talk:DVrandecic_(WMF)>, or via e-mail, or on the Tool ideas page <https://www.wikidata.org/wiki/Wikidata:Lexicographical_data/Ideas_of_tools>).

Also, if someone likes the idea but thinks that a different implementation would be better, please move ahead with that; I am happy to support and talk with you. There is much to improve here, but we hope that these two prototypes will lead to more development of content and tools <https://www.wikidata.org/wiki/Wikidata:Tools/Lexicographical_data> in the space of lexicographic data.

[1] Mandy Guo, Zihang Dai, Denny Vrandečić, Rami Al-Rfou: Wiki-40B: Multilingual Language Model Dataset, LREC 2020, https://www.aclweb.org/anthology/2020.lrec-1.297/

Newsletter #17: Phase β completed
by Denny Vrandečić 05 Feb '21

The on-wiki version of this edition of the newsletter is here: https://meta.wikimedia.org/wiki/Abstract_Wikipedia/Updates/2021-02-04

When we started the development effort towards the Wikifunctions site, we subdivided the work leading up to the launch of Wikifunctions into eleven phases <https://meta.wikimedia.org/wiki/Abstract_Wikipedia/Phases>, named after the letters of the Greek alphabet. This week we completed the second phase, Phase β <https://meta.wikimedia.org/wiki/Abstract_Wikipedia/Phases#Phase_%CE%B2_(bet…>.

With Phase α <https://meta.wikimedia.org/wiki/Abstract_Wikipedia/Phases#Phase_%CE%B1_(alp…> completed, it became possible to create instances of the system-provided Types in the wiki. This meant that one could go to the wiki and create, for example, a string, such as this Hello World! <https://notwikilambda.toolforge.org/wiki/ZObject:Z101> String in Lucas <https://meta.wikimedia.org/wiki/User:Lucas_Werkmeister>’s notwikilambda demonstration <https://notwikilambda.toolforge.org/wiki/Main_Page> system.

The goal of Phase β <https://meta.wikimedia.org/wiki/Abstract_Wikipedia/Phases#Phase_%CE%B2_(bet…> was to allow the creation of Types on-wiki, and to allow the creation of instances of these Types. The assumption is that we will provide just a very small set of core Types, and that almost all of the Types will be defined by the community on wiki.

We have discussed Types before in Newsletters #7 <https://meta.wikimedia.org/wiki/Abstract_Wikipedia/Updates/2020-11-10> and #15 <https://meta.wikimedia.org/wiki/Abstract_Wikipedia/Updates/2021-01-21>. As we discussed there, a good Type system can be very helpful in maintaining and working with the catalogue of functions. It can help with choosing the right function, with navigating and exploring the catalogue, and with finding errors in the function implementations.
In order to demonstrate that we indeed completed Phase β, we created a Type for Positive Integers <https://notwikilambda.toolforge.org/wiki/ZObject:Z70> on the notwikilambda test site, and a literal instance of that Type for the number one <https://notwikilambda.toolforge.org/wiki/ZObject:Z701>. (To make it clear, we do not expect to have a page for every natural number; in fact, it might well be that the community decides to restrict their creation as persistent objects. They will usually be created and passed through as literals that are created on the fly. If you are interested in a catalogue of natural numbers, I can refer you to the Linked Open Numbers <http://km.aifb.kit.edu/projects/numbers/> project.)

We have also considerably improved the user interface in Phase β, and it now often shows labels next to the bare identifiers. The labels are fully internationalized, and can be localized on-wiki (e.g. the labels for the Type Positive integer can be edited right on the Type page <https://notwikilambda.toolforge.org/wiki/ZObject:Z70>, as the key of the type). Note that in order to have editing privileges on that wiki, you need to be logged in. Many parts of the user interface used to be hard coded, but now are dynamically pulled from the wiki.

There are of course bugs, and we are tracking them on the Phabricator task board <https://phabricator.wikimedia.org/project/view/4876/>. If you encounter new bugs, please do raise them: either let us know, or create a new bug report in the column “Needs triage” so that we know to review it.

We are starting now with Phase γ <https://meta.wikimedia.org/wiki/Abstract_Wikipedia/Phases#Phase_%CE%B3_(gam…>. The goal is to create all the main Types of the pre-generic function model <https://meta.wikimedia.org/wiki/Abstract_Wikipedia/Pre-generic_function_mod…>: Function, Implementation, Tester, Function call, Error, and so on.
There are a number of tasks that will allow us to create these Types; Function call in particular will have some magic features. A bit out of order, we also started developing the supporting services, the function orchestrator <https://gerrit.wikimedia.org/g/mediawiki/services/function-orchestrator/+/r…> and the function evaluator <https://gerrit.wikimedia.org/g/mediawiki/services/function-evaluator/+/refs…>. This is in order to get input on the architecture <https://www.mediawiki.org/wiki/Extension:WikiLambda> as soon as possible.

Once the function data model is in place, Phase δ <https://meta.wikimedia.org/wiki/Abstract_Wikipedia/Phases#Phase_%CE%B4_(del…> will allow evaluating the function calls that we are building in the current phase. This, and the composition of functions that will be enabled in Phase ε <https://meta.wikimedia.org/wiki/Abstract_Wikipedia/Phases#Phase_%CE%B5_(eps…>, will be the beating heart of the technical features that Wikifunctions will provide. After δ it will be possible for everyone to call functions from their wiki pages, and after ε it will be possible to create any kind of function. Without question, there will still be tons of things that need to be developed and improved, but these will be the main steps towards providing a glimpse into what Wikifunctions will bring to the Wikimedia movement and beyond.

I want to end with a big shout out to the whole team, and to the volunteers who were contributing patches (in particular Arthur P. Smith and Gabriel Lee), and give a pointer to the logo concept contest <https://meta.wikimedia.org/wiki/Abstract_Wikipedia/Wikifunctions_logo_conce…>. The submission deadline is 23 February, and we already have nine great submissions! Take a look, feel free to add your own concept submissions, ideas, and comments on others' submissions, and tell others who might be interested.

Newsletter #15: A pre-generic function model
by Denny Vrandečić 28 Jan '21

The on-wiki version of this newsletter is here: https://meta.wikimedia.org/wiki/Abstract_Wikipedia/Updates/2021-01-21

This week’s newsletter is a bit more technical than others. Before we get to the main content, a reminder: the logo concept submission <https://meta.wikimedia.org/wiki/Abstract_Wikipedia/Wikifunctions_logo_conce…> is currently open! Six logo proposals have already been made, and they are each very much worth a look. I hope to see more proposals coming in; the submission deadline is 16 February.

The formal function model <https://meta.wikimedia.org/wiki/Abstract_Wikipedia/Function_model> as it is currently written assumes that the whole of the function model is already implemented. In particular, it relies on generic types, that is, types that are parameterized, usually by another type.

What is an example of a parametric type? Let’s take a look at the type for lists. A list is an ordered series of elements. What is the type of these elements? It could be anything! There are plenty of operations on lists that can be performed without needing to know what the type of the elements in the list is. One can take the first element of the list, or reverse the order of the elements in the list, or take every second element from the list, or much more.

What is the type of the item that is returned by the function that outputs the first element of such a list? You can’t know. Since the elements of the list can be of any type, the return type of the function outputting the first element could also be any type.

Now, instead of having a single type called "list" you could also have types called, say, "list of strings", or "list of numbers", or "list of booleans". Or, even more complicated, "list of lists of strings". You could also have a list of any kind of elements, or multiple kinds at once. Now, if you have a function that returns the first element of a "list of strings", you know that the return type of that function will always be a string.
You have more type safety <https://en.wikipedia.org/wiki/Type_safety>, and you can have your code-writing tools provide much better guidance when writing functions and function calls, because you know that there must be a string. It is also easier to write the functions because you don’t have to check for cases where there are elements of other types popping up.

But on the other side, we suddenly need many more new types: in theory, an infinite number of specialised types. And for each of these types, we would need all the specialised functions dealing with that type, leading to an explosion in the number of functions. This would be a maintenance nightmare, because there would be so much code that needs to be checked and written, and it all is very similar.

In order to avoid that, we could write functions that create the types and functions that we need, and that take the type of the elements of the list as an argument. So instead of having a type “list of strings”, we would have a function, call it “list”, that has a single argument. You can call “list(string)”, and that would result, on the fly, in a type that is a list whose elements are strings. Or you can call “list(integer)” and you get a list whose elements are integers.

The same is true for functions: instead of having a function “first” that only works on lists of strings and returns a string, you would have a function that takes the type “string” as an argument and returns a function that works on a list of strings and returns a string. Instead of writing a “first” function for every kind of list, we would write a function that creates these functions and then call them when needed.

There are also situations where the dimension of typed input on which you need to operate is limited by number, rather than or as well as type.
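Setting dimensions aside for a moment, the idea of functions that create types and functions can be sketched in Python as follows. This only illustrates the concept under stated assumptions: the names `list_of` and `make_first` are made up for this example, Python classes stand in for Wikifunctions Types, and nothing here reflects how Wikifunctions will actually implement generics.

```python
from functools import lru_cache


@lru_cache(maxsize=None)  # the same argument always yields the same type object
def list_of(element_type):
    """Create, on the fly, a list type whose elements must all be of element_type."""

    class TypedList(tuple):
        def __new__(cls, items=()):
            items = tuple(items)
            for item in items:
                if not isinstance(item, element_type):
                    raise TypeError(f"expected {element_type.__name__}, got {item!r}")
            return super().__new__(cls, items)

    TypedList.__name__ = f"list({element_type.__name__})"
    TypedList.element_type = element_type
    return TypedList


def make_first(element_type):
    """Create a 'first' function that works only on list_of(element_type)."""
    list_type = list_of(element_type)

    def first(xs):
        if not isinstance(xs, list_type):
            raise TypeError(f"expected {list_type.__name__}")
        return xs[0]  # guaranteed to be of element_type (IndexError if the list is empty)

    return first


StringList = list_of(str)          # corresponds to calling list(string)
first_string = make_first(str)     # a 'first' specialised for lists of strings
result = first_string(StringList(["alpha", "beta"]))
print(StringList.__name__, result)
```

Only two definitions are written by hand; every specialised list type and every specialised “first” function is produced by calling them with a type as the argument.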
As a more complicated example, if we had a method to do matrix dot multiplication <https://en.wikipedia.org/wiki/Matrix_multiplication>, the number of columns in the first input matrix and rows in the second must match. There, instead of taking just a type to create the matrix (say, a floating point number), our top-level type function would take two integers as the numbers of rows and columns as well. We could then call this method with matrix(float,4,3). Similarly, Earth-based ground positioning information is generally relayed by two dimensions of degree, and optionally one of altitude; there you might have tuple(float,2) for the former, and tuple(float,3) for the latter.

The exact way that the Wikifunctions community would decide to model the position type is left up to them; in deciding on a type, they might also want to explicitly specify the planetary body, the datum, the accuracy in each dimension, or other data as well. We just need to make sure we provide the flexibility to editors to represent things they will need or want. Note also that common types, such as geocoordinates, will likely be created as named types on wiki, but structures can always be created on-the-fly too.

This idea has many different names and concrete realisations in different programming languages, such as templates, concepts, parametric polymorphism, generic functions and data types, etc. For our purposes, we call this generics, and it is currently scheduled to be implemented in Phase ζ <https://meta.wikimedia.org/wiki/Abstract_Wikipedia/Phases#Phase_%CE%B6_(zet…>.

Now the thing is that the function model as it is currently described relies heavily on generics. But until they are in place, we can’t really use them. So we are in kind of a limbo, where the precise function model we are implementing right now is not specified anywhere, and instead we are adjusting on the fly based on the current state of the implementation and where we want to eventually end up.
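The matrix example from above, where a type is parameterized by numbers as well as by another type, could look roughly like this in Python. The sketch only checks types and dimensions and does no arithmetic; `MatrixType`, `matrix`, and `dot_result_type` are hypothetical names chosen for this illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MatrixType:
    """A matrix type described by its element type and its dimensions."""

    element_type: type
    rows: int
    cols: int


def matrix(element_type, rows, cols):
    """Type function in the spirit of matrix(float,4,3)."""
    return MatrixType(element_type, rows, cols)


def dot_result_type(a, b):
    """Return the type of the product a · b, or raise if the inputs do not fit."""
    if a.element_type is not b.element_type:
        raise TypeError("element types differ")
    if a.cols != b.rows:
        raise TypeError(f"cannot multiply: {a.cols} columns vs {b.rows} rows")
    return matrix(a.element_type, a.rows, b.cols)


product_type = dot_result_type(matrix(float, 4, 3), matrix(float, 3, 2))
print(product_type)
```

Because the dimensions are part of the type, a mismatch such as multiplying matrix(float,4,3) by matrix(float,4,3) can be rejected before any code runs on actual values.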
To get out of this limbo, we are publishing an explicit pre-generic function model <https://meta.wikimedia.org/wiki/Abstract_Wikipedia/Pre-generic_function_mod…>. Note that this also does not describe the model as it is right now, but it is the model that we are going to have mostly implemented by the end of Phase γ <https://meta.wikimedia.org/wiki/Abstract_Wikipedia/Phases#Phase_%CE%B3_(gam…>, and thus it is much more immediately useful than the final function model that we have currently described on-wiki.

The idea is then that once we get around to supporting generics, we will shift over from the pre-generic function model to the full function model.

Comments and suggestions on the pre-generic function model and the plan presented here are, as always, very welcome.

Background Wikipedia articles:
- Generic programming <https://en.wikipedia.org/wiki/Generic_programming>
- Parametric polymorphism <https://en.wikipedia.org/wiki/Parametric_polymorphism>
- Generic function <https://en.wikipedia.org/wiki/Generic_function>
- Type safety <https://en.wikipedia.org/wiki/Type_safety>

And remember the ongoing logo concept proposal contest <https://meta.wikimedia.org/wiki/Abstract_Wikipedia/Wikifunctions_logo_conce…>!

Newsletter #16: Community contributions to development
by Denny Vrandečić 28 Jan '21

The on-wiki version is here: https://meta.wikimedia.org/wiki/Abstract_Wikipedia/Updates/2021-01-28

Today I want to give an update on the development work, with a particular highlight on the community contributions - and renew our invitation for volunteer developers to join!

We have divided the work to get us to the initial launch of Wikifunctions into 11 phases, of which we have finished one. Phase α <https://meta.wikimedia.org/wiki/Abstract_Wikipedia/Phases#Phase_%CE%B1_(alp…> was about the ability to create objects on wiki, and our current Phase β <https://meta.wikimedia.org/wiki/Abstract_Wikipedia/Phases#Phase_%CE%B2_(bet…> is about creating new Types and creating instances of such Types.

In the backend of the wiki, we have been mostly busy with turning the hard-coded validators we had so far into validators that are based on the type definitions inside the wiki. This task is still on-going, and will remain so beyond Phase β. In the end, the validation of instances will be performed by functions written by the community and stored on the wiki itself, so we will need the ability to define and run functions in order to fully allow their validation. Until then, we will have incomplete validators that will become increasingly comprehensive.

The frontend work for supporting user-defined types has also progressed, and here we saw a number of contributions from two volunteers, Gabriel Lee and Arthur P. Smith. Thanks to them, a lot of the identifiers are now also enriched with the labels in the user’s language, and adding data is becoming much easier than before. As usual, you can see and experience the results right on the "notwikilambda" demo system <https://notwikilambda.toolforge.org/wiki/Main_Page> set up by another volunteer, Lucas Werkmeister, who in their "day job" works on Wikidata.

We try to reach out to a volunteer when we notice them contributing code.
We plan to invite them to chats with individual members of the team, discussing the tasks, and also invite them to some of our daily stand-up meetings, if the timing permits it. We have had a few people join one or more of our stand-ups so far, and we are looking forward to inviting more.

There are several ways a developer can contribute at this time, and the list I am going to give here is far from complete.

One way is to take a look at our task board on Phabricator <https://phabricator.wikimedia.org/project/view/4876/>, and see if there is a task that you would like to work on. This is probably the best way to contribute to the WikiLambda extension <https://www.mediawiki.org/wiki/Extension:WikiLambda> itself, to work on our UI front-end and the MediaWiki back-end. You might instead find a task to work on within the stand-alone services that will support Wikifunctions: the function-orchestrator <https://gerrit.wikimedia.org/g/mediawiki/services/function-orchestrator> and the function-evaluator <https://gerrit.wikimedia.org/g/mediawiki/services/function-evaluator>.

Beyond Phabricator, there are also development projects which are less tied in to our daily development work, and I would love to see those happen. For example, Lucas Werkmeister has started an alternative implementation of the function evaluator <https://github.com/lucaswerkmeister/graaleneyj> of an earlier prototype of the project, based on GraalVM. Other, parallel evaluation engines would also be very interesting: for example, one running on cloud computing resources, one running in the browser, or one running on the local machine.

Besides that, I would love to see alternative interfaces to interact with the Wikifunctions system once it is in place.
Possible interfaces might be viewing and/or editing via a modern CLI, through a website that is not based on MediaWiki (perhaps hosted on Cloud Services <https://wikitech.wikimedia.org/wiki/Help:Cloud_Services_Introduction>), through a voice-based or touch-based interface, within a Hadoop workflow, or even within a spreadsheet. Some of these concepts might still be a bit premature, but if you are interested in one of these, let me know and we will find a place on-wiki to collect and discuss ideas. I would particularly love to see an interactive visualization of a function evaluation, based on a composed implementation, allowing the user to better understand how a specific value is derived. Such projects can be started more or less independently of the main Wikifunctions work (although, admittedly, it might often make sense to wait until the project has stabilized a bit, in order not to develop against a moving target). As the project progresses, we expect that more and more people with an increasingly diverse skill set will feel able to contribute to the project. Until then, there are already a few other ways you can contribute if you’re not a developer:
- We are currently looking for submissions of logo concepts <https://meta.wikimedia.org/wiki/Abstract_Wikipedia/Wikifunctions_logo_conce…> (we also moved the submission deadline for logo concepts to 23 February).
- You can join the Wikidata community to increase the coverage of the lexicographic knowledge in Wikidata <https://www.wikidata.org/wiki/Wikidata_talk:Lexicographical_data> (we are working on a few ideas in this direction that we will talk about in the coming weeks).
- You can read through our documentation <https://meta.wikimedia.org/wiki/Abstract_Wikipedia> and ask questions, make clarifications, and help us to make the project more accessible.
- There was the suggestion to collect possible user stories <https://meta.wikimedia.org/wiki/Abstract_Wikipedia/User_stories>, which would also be very welcome.
As the project progresses, more and more tasks will become available. We hope you will keep an eye on our progress, and join us whenever you feel ready to contribute!
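The move from hard-coded validators to validators driven by type definitions, described earlier in this newsletter, can be illustrated with a rough sketch. Everything here is an assumption made for illustration: the dictionary layout, the type and key names, and the function `validate_instance` are hypothetical and do not reflect the actual WikiLambda data model or API.

```python
# Hypothetical sketch: a type definition is plain data, and one generic
# validator interprets it, instead of one hard-coded validator per type.

# Checks for a couple of assumed built-in types.
BUILTIN_CHECKS = {
    "String": lambda value: isinstance(value, str),
    "Boolean": lambda value: isinstance(value, bool),
}

# Assumed shape of a user-defined type: each key mapped to its expected type.
TYPE_DEFINITIONS = {
    "Person": {"name": "String", "alive": "Boolean"},
}


def validate_instance(type_name, instance):
    """Return a list of error messages; an empty list means the instance is valid."""
    if type_name in BUILTIN_CHECKS:
        ok = BUILTIN_CHECKS[type_name](instance)
        return [] if ok else [f"expected {type_name}, got {type(instance).__name__}"]
    if type_name not in TYPE_DEFINITIONS:
        return [f"unknown type {type_name}"]
    if not isinstance(instance, dict):
        return [f"expected an object of type {type_name}"]
    errors = []
    keys = TYPE_DEFINITIONS[type_name]
    # Every key declared by the type must be present and valid for its own type.
    for key, key_type in keys.items():
        if key not in instance:
            errors.append(f"missing key {key}")
        else:
            errors.extend(f"{key}: {e}" for e in validate_instance(key_type, instance[key]))
    # Keys that the type does not declare are rejected.
    for key in instance:
        if key not in keys:
            errors.append(f"unexpected key {key}")
    return errors
```

In a design like this, supporting a new type only means adding an entry to `TYPE_DEFINITIONS`; the validator code itself stays unchanged, which is the property that lets type definitions live on the wiki rather than in the source code.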

Newsletter #14: Looking for a logo for Wikifunctions
by Denny Vrandečić 15 Jan '21

The on-wiki version of this newsletter can be found here: https://meta.wikimedia.org/wiki/Abstract_Wikipedia/Updates/2021-01-15 We are looking for the Wikifunctions logo concept <https://meta.wikimedia.org/wiki/Abstract_Wikipedia/Wikifunctions_logo_conce…> ! Just as with the name, we are starting with a request to submit proposals. As of this writing, three proposals are already in - we are looking for more proposals for logo concepts. The deadline for submitting proposals is in a month from now, February 16, 2021, followed by a single round of voting. As with Wikidata, the proposals don't have to be fully polished. There will be a final step of legal review and design polishing. We invite everyone to participate in the logo concept process and submit your proposals. I am amazed by the creativity and productivity of the communities to come up with logos for our projects. My personal favorite, by quite a stretch, is the Wikidata logo. Maybe it is because I am obviously biased towards that project, but I also love its recognizability, the layers of meaning it has, its flexibility to change and accommodate events and situations by sneakily changing the word that the bar encodes, etc. I hope that the Wikifunctions logo will have a similar form of flexibility, and that it is similarly recognizable, even when sized down a lot (the favicon for Wikidata <https://commons.wikimedia.org/wiki/File:Wikidata_Favicon_color.svg> is really neat). I could imagine that going through the works and notes of people like Alan Turing, Ada Lovelace, and Gottfried Wilhelm Leibniz could lead to interesting inspirations. A Turing machine? Lambda calculus? Aristotelian logic and its various notations? Maybe something more practical. A punched card as it was used for the Jacquard loom? A punchhole card could make an interesting basis for a logo. Gears and steam-power are another possible basis, as they have implemented ‘smart’ appliances such as automatic doors for millennia. 
Or semiconductor chips and boards. All of these have the potential to look very detailed and fuzzy when they get scaled down. So it might be worthwhile to explore slightly different designs at different sizes. E.g. the number of holes in a punchhole card, or the number of circuits on a board might be different. For example I could imagine a chip in the center with a lambda embossed, where circuits connect it to the other Wikimedia projects, with highly stylized logos - just a rectangle for Wikidata, a triangle for Wikivoyage, nine little rectangles for Wiktionary, etc. And the smaller version is just the lambda with lines going out, etc. There is no need to follow the Wikimedia colors. But there is also no need not to. I think the incorporation of these colors in the Wikidata logo worked out beautifully. But the logo of Wikisource is no worse due to not incorporating the colors. One proposal is already on wiki. I am adding my own proposal, because it is so easy to create: a backslash, a lambda, and a slash, looking like a W in many fonts: \λ/ 😉 I am looking forward to your ideas, and just want to start the conversation and the brainstorming here. The goal is to have discussions and ideas, and everyone should feel comfortable to borrow from each other in creating proposals for the first round. Let’s discuss possible inspirations and ideas for the next few weeks. I would love this to be much less a competition or a contest with a single winner, and much more a collaboration and cooperation, where, in the end, we all win. Here are also some great words of consideration by Zack McCune <https://meta.wikimedia.org/wiki/User:ZMcCune_(WMF)>: Logos are visual tools. A logo uses recurring graphic elements (made of colors, shapes, icons) to represent a company, initiative, project, or organization without any words. 
When used in combination with a formal name, the precise arrangement of the logo and name is called a brand lock-up. (Here’s a handy set of examples from the University of Indiana in the United States https://brand.iu.edu/design/logos-lockups/marketing-lockups.html) Making a great logo is about balancing a few graphic design factors:
1. Recognition - Can people understand what is depicted?
2. Association - Do the graphic elements communicate the project or company’s purpose? Do they link the project or company to a family of related brands? Does the design suggest connection to a theme, object, or process essential to the project or company?
3. Originality - Is the graphic unique enough to not be confused with existing projects or companies?
4. Versatility - Will the graphic work BIG and small? Can people use the graphic easily (e.g. place it correctly in new designs, adapt it for wide application), and use it quickly, possibly even drawing or approximating it?
In the Wikimedia world, association is one of the most important qualities for a new project logo. There is much recognition (and love, which marketers call “affinity”) for Wikipedia and Wikimedia projects and referencing graphic elements (colors, shapes, symbols) from these existing logos will help BUILD upon their reputations. As you look to make a logo, TRY LOTS OF APPROACHES. Collect symbols associated with the topic. Look for patterns/trends in the symbol sets. Universities, for example, often use heraldic design traditions <https://www.universityaffairs.ca/features/feature-article/the-coat-of-arms-…> to suggest history and credibility, but are also criticized as out-dated and overly western. Consider where you want to be original and where you want to use associations. Noun Project <https://en.wikipedia.org/wiki/en:The_Noun_Project> searches can be helpful for finding symbols linked to phrases or topics. Logos also have stories embedded in them. 
Consider how the artwork you are making will be part of future conversations. The Wikidata logo, for example, intentionally resembles a bar code. This is quickly apparent and communicates the project’s purpose. But there is more to learn about the logo… and that story deepens the care and joy people have for this brand. Simplicity in design is hard. Sketch many many ideas! Try to limit yourself to a small drawing area or with a fixed stroke width to make ideas even more minimal. Also: do not *start* with color. Logos are registered in black and white and must function as memorable single-color expressions to be effective across many use cases (e.g. on t-shirts, on slides, on multi-colored backgrounds). Got a set of logo ideas? Great! You should also TEST logo ideas with friends, family, colleagues, and *ideally* the audience of people you WANT to have a functional/emotional reaction to the design. Feedback will help ideas improve and grow towards highly memorable and effective graphic design. Further design guidance now at the Logo page <https://meta.wikimedia.org/wiki/Logo> is adapted from the great advice originally captured <https://www.mediawiki.org/w/index.php?title=Project:Proposal_for_changing_l…> by my colleague Volker <https://en.wikipedia.org/wiki/User:Volker_E.>. The process will take a while, and we explicitly reserve the freedom to modify it as we go if we see that things are not working out as planned. And our first step is to come up with a sizeable number of ideas and proposals, before we start to vote and whittle them down to a small set of core ideas that in turn can be expanded upon. Join the Wikifunctions logo competition <https://meta.wikimedia.org/wiki/Abstract_Wikipedia/Wikifunctions_logo_conce…> ! I am looking forward to what you come up with!

Newsletter #13: Welcome, Cory!
by Denny Vrandečić 08 Jan '21

The on-wiki version of this newsletter is here: https://meta.wikimedia.org/wiki/Abstract_Wikipedia/Updates/2021-01-07 We are very happy to welcome Cory Massaro, who is joining the Abstract Wikipedia team as a software engineer. I will let Cory introduce himself in his own words: “I obtained my B.A. in creative writing and classical languages in 2010, then studied comparative literature with a focus on oral history and epic before switching to computational linguistics. I then spent several years working on speech recognition systems and internationalization. More recently, I’ve taken on mentorship and educational roles with a particular focus on helping entry-level workers build class consciousness. “My core passions are teaching/learning, the arts, language, and the relationship of these to workers’ rights. If there is a throughline uniting these interests, it is this: I care deeply about knowledge. I want to contribute to a culture where people possess sufficient education, leisure, and resources to relate to knowledge and act on it. To these ends, I have contributed research analyzing how ideology can create dialect. I also conduct workshops on political rhetoric. In my free time, I write, learn languages, create text adventures, and play music. “I discovered programming as a matter of necessity. Many career paths seemed unreachable due to student loans, while the for-profit tech sector offered a path to stability. Only later did it become clear that values-driven tech work existed. “For this reason, I’m delighted to be joining an organization built on principles of freedom, openness, and cultural/linguistic pluralism. I can’t wait to help make knowledge more accessible and am excited to learn more about this fundamentally democratic approach to language and community engagement. Plus, if there is a modern global analog to the ancient keepers of oral history, Wikimedia is probably it, and that is beautiful.” Please, join me in welcoming Cory to the team! 
Also, as you may already know, in December we finalised the name for the new Wikimedia wiki. The name for the repository of algorithms will be Wikifunctions, as voted for by the community. The domain name wikifunctions.org has been secured. The next step will be to start the process to select the logo concept for Wikifunctions, which is slowly kicking up its speed. Submissions can be made here <https://meta.wikimedia.org/wiki/Abstract_Wikipedia/Logo>, and in the following days we will publish a few supporting documents and re-do the timeline for the process in order to make it a bit simpler. This is similar to the approach used for the Wikidata logo. I am very much looking forward to seeing the process play out!

Wiki of functions naming contest
by Nick Wilson (Quiddity) 22 Dec '20

Hi all. The details of the re-naming process for the project currently known as Wikilambda are now at https://meta.wikimedia.org/wiki/Abstract_Wikipedia/Wiki_of_functions_naming… I'd appreciate your questions or feedback onwiki! I've added some additional notes on the talkpage there, too.[1] Over the next 2 weeks I'll send out wider reminders for people to continue proposing alternative names. The first round of voting will begin on 29 September. Please add any comments about specific names into the new locations, linked there. I'll be working to migrate content from the old talkpage, too. I'll update this thread with each of the next steps over the coming weeks. Thanks as always, Nick / Quiddity [1] https://meta.wikimedia.org/wiki/Talk:Abstract_Wikipedia/Wiki_of_functions_n… And also some comments at https://meta.wikimedia.org/wiki/Talk:Abstract_Wikipedia/Name#Naming_process…

Last newsletter for 2020
by Denny Vrandečić 18 Dec '20

The on-wiki version of the newsletter is here: https://meta.wikimedia.org/wiki/Abstract_Wikipedia/Updates/2020-12-18 ---- This is the last newsletter for 2020. We will take a break in writing the newsletter until January. We wish everyone happy holidays and a good start into 2021, a year which will see the launch of a new Wikimedia project. Yay! Last week saw the publication of a conversation between Heather Ford <https://en.wikipedia.org/wiki/Heather_Ford>, Professor at the University of Technology, Sydney, and Denny Vrandečić. They discuss the potential advantages but also ethical challenges that Abstract Wikipedia will face: Automated Facts, Data Contextualization and Knowledge Colonialism <https://bigdatasoc.blogspot.com/2020/12/automated-facts-data-contextualizat…> We presented Abstract Wikipedia at the Wikimedia Ukraine 2020 conference <https://ua.wikimedia.org/wiki/%D0%92%D1%96%D0%BA%D1%96%D0%BA%D0%BE%D0%BD%D1…>. A recording of the presentation is available here, using slides translated to Ukrainian: video <https://youtube.com/watch?v=tYMQVpUuKks&t=6910> There was also a talk at the Fall 2020 edition of the SMWCon <https://www.semantic-mediawiki.org/wiki/SMWCon_Fall_2020> in Vienna. That talk went through the history of Semantic MediaWiki to Wikidata and was focused more around the usage of Semantic MediaWiki as a knowledge tool - and how a library of functions can be helpful with that. keynote <https://www.youtube.com/watch?v=EosGUoucthg> SMWCon also featured a panel with Lydia Pintscher <http://www.lydiapintscher.de/about.php>, Markus Krötzsch <https://iccl.inf.tu-dresden.de/web/Markus_Kr%C3%B6tzsch/en>, and others, discussing the interaction between Semantic MediaWiki and Knowledge Representation - a topic that will be crucial also for Abstract Wikipedia: panel <https://www.youtube.com/watch?v=-Lp4azOUX1Y> Happy new Gregorian year!
