I saw that there is a Twitter account (<https://twitter.com/wikilambda>).
Who is this run by? Just wondering for purposes of transparency, and because
I see it's been up since March.
-- Sam-2727
Hello all,
As promised, here’s an initial draft list of topics that we will need to
cover in the next few months. Think of this as a sneak preview - we can’t
have all the conversations at once, but we want to make sure that you know
that all of them will be covered and that we will get to them. Obviously,
if you want to discuss things earlier, by all means feel free, but when
we’re prioritizing answers and so on, we’ll try to follow this outline in
order to keep conversations more manageable.
Some of these might turn out to be rather quick, others might take
considerably longer.
If there’s something missing, please feel free to suggest it, so that it
doesn’t get lost. This list will also be on wiki for easy reference and as
an index of topics:
https://meta.wikimedia.org/wiki/Abstract_Wikipedia/Early_discussion_topics
The order of topics will also be worked on together on wiki with you.
Here’s the list:
- Which communication channels we want to use (e.g. list, Wikispore, Meta,
etc.)
- Naming and logo of the project and its components
- Data and evaluation model for the code repository (Wikilambda)
- How to effectively communicate about the project (onboard potential
contributors, have easy to understand documentation and communication
channels, etc)
- Early planning of code development work
- Licenses for the different parts of the project
- Ethical considerations, particularly regarding diversity, representation
and sustainability
- Ethical considerations regarding AI aspects of the project
- How to incorporate external expert knowledge and advice
- How to invoke the code repository from other Wikimedia projects
- Theoretical and practical underpinnings of the natural language generation
- Security and safety models for the code repository
- Early community seeding considerations
- Safety and Code of Conduct within the new community
- Usage of the ontological knowledge in Wikidata
- Usage of the lexicographical knowledge in Wikidata
- How do we structure our engagement with the Wikipedias, and with which
Wikipedias?
- How do we reach out to other Wikimedia communities?
- How and whether do we reach out to non-Wikimedia communities?
- Where should the Abstract Wikipedia content be created and maintained?
As said, please regard this as an early draft, and please feel free to
raise further topics. On wiki we can start ordering these topics, and
then in the upcoming weeks we will start discussing and working through
them.
Also, Adam sent a list of topics, which I have already added to the wiki
page, and he also started a wiki page for ideas. Thanks for the initiative,
Adam - it is great to see this enthusiasm!
Thank you!
Denny
I recently created a wiki page for the group to list, expand upon, brainstorm and discuss ideas for the Wikilambda and Abstract Wikipedia projects. Please feel free to edit the wiki page and to discuss it in the page’s discussion area.
https://meta.wikimedia.org/wiki/Abstract_Wikipedia/Ideas
Best regards,
Adam Sobieski
Abstract Wikipedia Mailing List,
Hello. Congratulations on launching the new and exciting Wikilambda and Abstract Wikipedia projects.
Which Programming Languages?
Which programming languages will be used on Wikilambda for Abstract Wikipedia? Might a new programming language be useful? Might it be useful to transpile a new programming language into an existing one (e.g. as TypeScript transpiles to JavaScript)?
Models of Function Definitions
On the topic of programming language design: while we most often consider functions as comprising signatures and bodies, might it be useful to consider models of functions with more components? We could express such things with multiple, nested “{}” brackets or scopes.
In some models, a function definition might include a guard, preconditions, a body, and effects.
Inspired by PDDL:
function foo(…)
{
    guard         { /* … */ }
    preconditions { /* … */ }
    body          { /* … */ }
    effects       { /* … */ }
}
In other models, a function definition might include requires, a body, and ensures.
Inspired by code contracts:
function foo(…)
{
    requires { /* … */ }
    body     { /* … */ }
    ensures  { /* … */ }
}
Such models can be phrased as transformations or mappings into existing languages; they separate and organize varieties of content which frequently occur in function body definitions.
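For illustration, here is a rough sketch (in Python; not a Wikilambda design, and the function name is made up) of how a requires / body / ensures model could map onto an existing language, with the contract sections becoming runtime checks around the body:

```python
# Sketch: a contracts-style function mapped onto a plain Python
# function, with requires/ensures sections as runtime assertions.

def int_divide(dividend, divisor):
    # requires: preconditions, checked before the body runs
    assert divisor > 0, "requires: divisor must be positive"

    # body: the actual computation
    quotient, remainder = divmod(dividend, divisor)

    # ensures: postconditions, checked before returning
    assert dividend == quotient * divisor + remainder
    assert 0 <= remainder < divisor
    return quotient, remainder

int_divide(7, 3)  # (2, 1)
```

A transpiler for a new language with explicit requires/ensures scopes could emit exactly this kind of checked code in a host language.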
Structured Comments, Metadata and Versioning
Structured content (e.g. XML) in comments is both useful and increasingly popular. Structured comments can provide extensible metadata for functions, ease the searching for functions in large collections, and make possible the automatic generation of documentation.
Structured comments could be processed and utilized by the Wikilambda project to provide features.
Also, some programming languages (e.g. C#, Java) support metadata attributes which can adorn methods and their parameters. A useful attribute is [Obsolete], which supports messages steering users toward a newer version of a function, e.g. [Obsolete(“Consider use of foo_2 instead.”)].
Via structured comments, attributes, or other means, the capability of adorning functions with versioning-related metadata would simplify versioning scenarios on evolving crowdsourced resources.
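As one possible shape for such metadata (sketched in Python with a hypothetical decorator name, not an existing Wikilambda feature), an obsolescence marker could both warn callers and attach machine-readable versioning data to the function:

```python
import functools
import warnings

def obsolete(message):
    """Hypothetical attribute marking a function as superseded,
    analogous to C#'s [Obsolete("...")] attribute."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            # Warn callers at call time that a newer version exists.
            warnings.warn(f"{func.__name__} is obsolete: {message}",
                          DeprecationWarning, stacklevel=2)
            return func(*args, **kwargs)
        # Machine-readable metadata that tooling could index.
        wrapper.obsolete_message = message
        return wrapper
    return decorator

@obsolete("Consider use of foo_2 instead.")
def foo():
    return "old result"
```

Tooling could then search a large function collection for the `obsolete_message` metadata to generate documentation or migration hints.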
Namespaces and Modules
Namespaces and modules can be useful when organizing large collections of functions. With namespaces or modules, multiple paradigms or ecosystems of functions could coexist in a crowdsourced resource for use in natural language generation scenarios.
Global Variables and Runtime Environment API
Global variables, such as the currently desired natural language for output, could be provided by the runtime environment. An API could be specified for the runtime environment.
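One way to avoid true global variables (sketched below in Python; the class and function names are invented for illustration) is to pass an environment object carrying ambient settings, such as the desired output language, into each renderer:

```python
from dataclasses import dataclass

# Hypothetical runtime environment API: renderers receive an
# Environment object instead of reading global variables.

@dataclass(frozen=True)
class Environment:
    output_language: str  # e.g. an IETF language tag such as "de"

def render_greeting(env: Environment) -> str:
    # A toy renderer that consults the environment for the
    # output language, falling back to English.
    greetings = {"en": "Hello", "de": "Hallo", "it": "Ciao"}
    return greetings.get(env.output_language, greetings["en"])

render_greeting(Environment(output_language="de"))  # "Hallo"
```

Making the environment an explicit, immutable parameter also keeps functions pure and testable, which matters for a catalog of shared functions.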
Output Streams, Logging and Diagnostic Events
When editing / developing Wikilambda content for use on Abstract Wikipedia, it would be convenient to be able to output to multiple streams, to log, or to raise typed events. It would also be useful to be able to aggregate, organize and view these diagnostic outputs with a configurable granularity or verbosity.
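As a minimal sketch of configurable diagnostic granularity (using Python's standard logging module; the logger name and renderer are illustrative, not Wikilambda APIs), each renderer could write to its own named channel and viewers could filter by level:

```python
import logging

# Each renderer logs to its own named channel; the viewer chooses
# the verbosity (DEBUG for fine-grained, INFO/WARNING for coarse).
logging.basicConfig(level=logging.INFO)
renderer_log = logging.getLogger("wikilambda.renderer")  # illustrative name

def render(item):
    renderer_log.debug("picking lexeme for %r", item)  # fine-grained detail
    renderer_log.info("rendering %r", item)            # coarse progress
    if item is None:
        renderer_log.warning("missing item; emitting placeholder")
        return "[?]"
    return str(item)
```

Typed events and multiple output streams could be layered on top of the same idea: named channels plus per-channel severity filtering.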
Developer Experiences with the Rendering of Natural Language
One possible user experience for Wikilambda editors / developers would be a
toggleable developer or debugging mode on Abstract Wikipedia in which one
could view rendered articles and hover over natural language output to view
relevant diagnostic messages in hoverboxes.
Another possibility would be a mode in which visual indicators for
diagnostic messages are placed in a margin to the side of the rendered
natural language output.
Best regards,
Adam Sobieski
Hi,
I want to auto-generate disambiguation descriptions for African
politicians to be added to Wikidata, e.g. from the country Mozambique
(Q1029) the following descriptions should be generated:
Mozambican politician (en)
Mosambikanischer Politiker (de)
politico mozambicano (it)
...
This could be extended to other professions. My questions:
- Can anyone point me to data sources where best to look up country
adjectives such as "Mozambican"?
- Where/how to best store the lexical information for best reuse with
other renderers?
- If I create small renderers for these short descriptions, what
architecture do you prefer for best reuse?
My just-get-it-done solution would be a set of CSV files and a few lines
of Perl code, but maybe this use case can be aligned with Abstract
Wikipedia to learn more about it.
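For illustration, the just-get-it-done approach might look like the following sketch (in Python rather than Perl; the CSV layout and the pre-inflected German adjective form are invented for this example):

```python
import csv
import io

# Hypothetical CSV of per-language country adjectives. The German
# form is stored pre-inflected; real renderers would need proper
# agreement rules per language.
ADJECTIVES_CSV = """\
country,en,de,it
Q1029,Mozambican,Mosambikanischer,mozambicano
"""

# Word order differs per language: Italian puts the adjective
# after the noun.
PATTERN = {"en": "{adj} {prof}", "de": "{adj} {prof}", "it": "{prof} {adj}"}
PROFESSION = {"en": "politician", "de": "Politiker", "it": "politico"}

def load_adjectives(text):
    return {row["country"]: row for row in csv.DictReader(io.StringIO(text))}

def describe(country_qid, lang, adjectives):
    return PATTERN[lang].format(adj=adjectives[country_qid][lang],
                                prof=PROFESSION[lang])

adjectives = load_adjectives(ADJECTIVES_CSV)
describe("Q1029", "en", adjectives)  # "Mozambican politician"
```

Even this tiny sketch already surfaces the two hard parts of the use case: inflection (German adjective endings) and word order (Italian noun-adjective), which is exactly what shared renderers would have to abstract over.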
Looking forward to collaborating,
Jakob
Hi all,
I just wanted to say a quick welcome to everyone here!
This is the mailing list to discuss the Abstract Wikipedia proposal and its
development and growth. The proposal is here:
https://meta.wikimedia.org/wiki/Abstract_Wikipedia
There is a lot to discuss and plan, and I am very excited about this
becoming real. I am looking forward to reading from all of you! Next week
I will start to share more myself: thoughts, ideas, and how we could start
this off. But feel free to start already!
Stay safe,
Denny
Hello Abstractipedia! There is so much we can do together.
I think often of how commercial encyclopedias have shifted to making
materials for children; and how essential it is to support the next
generation of knowledge sharers and take seriously engaging with their
discovery process.
Having renderers for simplified (lang), and resolving the age-old challenge
of how to think of 'simplified' as a toggle or slider for all languages,
will be particularly satisfying.
SJ
P.S. I hope we still get some use out of the other excellent + generative
name ;)
https://meta.m.wikimedia.org/wiki/Wikilambda
🌍🌏🌎🌑
Hello all,
Again, welcome!
Here’s my first email to this list with my new account and in my new role.
Welcome, everyone who signed up here - I am so thankful that you are here.
There is a lot we have to do, and a few great discussion threads have
already started; I will come join them soon.
We are embarking on an ambitious journey, and we don’t know where we will
end up, or how we will get there. Our goal is clear: we want to help more
people to be able to share in more knowledge in more languages. This is a
two-fold goal: in the end we want to be in a place, (1) where more people
can read more articles in languages and contexts with which they are
comfortable, and also (2) where more people will feel comfortable to
contribute to the sum of all knowledge and make their voices heard as part
of the global conversation. We want to build a meaningful contribution
towards a world of more knowledge equity.
This will take time. Years, in fact. And we will take at least one major,
but necessary, detour on the way to that goal: the creation of a new
Wikimedia project that will allow the collaborative creation of a catalog
of functions, a form of knowledge asset that has not been a primary concern
of the other Wikimedia projects so far.
We expect that our first milestone will be this new catalog, which in the
proposal was called “Wikilambda”. So, in fact, we will not focus our
development work much on Wikipedias, natural language generation, or
articles during that first part of our journey. And that is quite
intentional: as you can see in some of the other threads, there is a huge
amount of existing work that we need to make accessible for the project
and figure out how to build on top of. We don't want to reinvent wheels
if we can avoid it.
Right now, we don’t know yet how exactly we will implement the natural
language generation, and how the abstract content will be stored and
created and maintained. We are confident it can be done. But there are many
open questions around that part, and we will need some time and partnership
with experts inside and outside of the Wikimedia communities to work out
which path we will explore. And I can see in the discussions on this list
already some great suggestions and points - thanks for that! But the goal
is to create, with the library of functions, a foundation that is strong
and robust enough to try out different paths, and even if we get it wrong
one or two times, we will be able to pivot and explore other paths.
To sketch the timeline roughly: the first part of the project will be
focused on launching the function wiki and integrating it with the other
Wikimedia projects. Only once that is in place will we start the
development work on the natural language generation part, while
continuing to improve the functional foundation. This first part is
expected to take at least a year.
This does not mean that we don't want work to happen on the natural
language generation part. As is being discussed, it would be great to
create an accessible overview of the current state of the art, and to make
sure that lessons learned won't be forgotten when we start on this. It is
time to survey the landscape of previous work so that we can then make, as
a community, confident decisions about which paths to explore. This is a part
where I am hoping for considerable input from you.
Also, we are starting with a small exploratory core team, much smaller than
it was for Wikidata: me, Denny, as project lead; Adam Baso as Director;
James Forrester as engineer; and Nick Wilson as documentation specialist.
This core team is relying on the support from many others in the
Foundation, from other movement bodies, and from the communities. We plan
to look for more funding in order to support the work, particularly as
things become increasingly concrete during the project development.
We are creating a first draft of a list of topics that we will have to
discuss in the first few months, and we’ll send it out tomorrow. This list
won’t be complete, and please make yourself heard so that we don’t miss
important topics. Obviously, we can add topics at any time later too, but
we want to try to set expectations and share insight in how we are planning
to work on these topics.
We think of this as a collaborative project, and we will rely on the
movement's expertise. We want to make sure that we listen and work together
on this. We know this is crucial for the success of the project. We are
committed to having our communication, discussions, and work be in the open,
in order to get early feedback and early engagement. But this also means
that not every single one of our messages will be polished, that we will
surface disagreements within the team, and that we will make mistakes - and
that we want to be able to correct them and, through interacting with you,
refine our work and our plans, and improve.
Please feel free to ask questions and comment on this rough outline for how
the project will work. Thank you!
I am super excited - thank you all for being here, and I am looking forward
to the next few years!
Stay safe,
Denny
I would like to move this discussion to a new thread for those interested in knowledge representation for natural language generation.
Something which interested me about UNL, when it was recently brought to my attention, was that it expands upon predicate calculus by providing expressiveness for placing attributes upon objects and relations. I presumed that one could also place them upon expressions (@a4, @a8, @a13) and upon sets of expressions as well (@a14).
UNL’
{
    r1.@a1(o1(icl>domain1).@a2, o2(icl>domain2).@a3).@a4
    r2.@a5(o3(icl>domain3).@a6, o4(icl>domain4).@a7).@a8
    r3.@a9(o5(icl>domain5).@a10, o6(icl>domain6).@a11, o7(icl>domain7).@a12).@a13
}.@a14
I then considered that, beyond attributes, one could place attribute-value pairs upon objects, relations, expressions and sets of expressions.
Something New
{
    r1.[@a1=v1](o1.[@a2=v2], o2.[@a3=v3]).[@a4=v4]
    r2.[@a5=v5](o3.[@a6=v6], o4.[@a7=v7]).[@a8=v8]
    r3.[@a9=v9](o5.[@a10=v10], o6.[@a11=v11], o7.[@a12=v12]).[@a13=v13]
}.[@a14=v14]
Resembling W3C technologies, URIs could be used for objects, relations and attributes instead of plain-text strings.
Beyond attribute-value pairs, it is also possible that objects, relations, expressions, and sets of expressions could each be objects in the sense of object graphs (or potentially semantic graphs).
In the Wikipedia article about UNL, there is the example: “The sky is blue?!”. The Wikipedia article indicates the tabular representation in UNL for that utterance to be:
aoj( blue(icl>color).@entry.@past.@interrogative.@exclamation , sky(icl>natural world).@def)
In UNL’, a tabular representation for that utterance might be:
aoj.@past( blue(icl>color).@entry , sky(icl>natural world).@def).@interrogative.@exclamation
In the new knowledge representation format, we could, resembling the mapping of valueless attributes from HTML to XHTML, map the valueless attributes to Boolean-valued attributes and assign them the value of true:
aoj.[@past=true]( blue(icl>color).[@entry=true] , sky(icl>natural world).[@def=true]).[@interrogative=true].[@exclamation=true]
We could also map semantics to other attribute-value pairs, for example:
aoj.[@tense=past](…)
I am finding these topics to be interesting. I am also presently considering an object model for the new knowledge representation format, resembling how the DOM assists developers working with XML.
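A minimal sketch of such an object model (in Python; the class names are hypothetical, not part of UNL or any existing format) might represent relations over objects, with attribute-value pairs on every node:

```python
from dataclasses import dataclass, field

# Sketch of an object model for the proposed format: relations over
# objects, where objects and relations each carry attribute-value pairs.

@dataclass
class Node:
    attributes: dict = field(default_factory=dict)

@dataclass
class Obj(Node):
    name: str = ""
    restriction: str = ""  # e.g. "icl>color"

@dataclass
class Relation(Node):
    name: str = ""
    arguments: list = field(default_factory=list)

# "The sky is blue?!" in the attribute-value form above; for brevity,
# the expression-level attributes are kept on the top-level relation.
blue = Obj(name="blue", restriction="icl>color", attributes={"entry": True})
sky = Obj(name="sky", restriction="icl>natural world", attributes={"def": True})
utterance = Relation(
    name="aoj",
    arguments=[blue, sky],
    attributes={"tense": "past", "interrogative": True, "exclamation": True},
)
```

A DOM-like traversal and query layer could then be defined over these nodes, which is where such an object model would start to convenience developers.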
Any thoughts on these ideas or about knowledge representation for natural language generation in general?
Best regards,
Adam Sobieski