I saw that there is a Twitter account (<https://twitter.com/wikilambda>).
Who is this run by? Just wondering for purposes of transparency, and because
I see it's been up since March.
-- Sam-2727
Hello all,
As promised, here’s an initial draft list of topics that we will need to
cover in the next few months. Think of this as a sneak preview - we can’t
have all the conversations at once, but we want to make sure that you know
that all of them will be covered and that we will get to them. Obviously,
if you want to discuss things earlier, by all means feel free, but when
we’re prioritizing answers and so on, we’ll try to follow this outline in
order to keep conversations more manageable.
Some of these might turn out to be rather quick, others might take
considerably longer.
If there’s something missing, please feel free to suggest it, so that it
doesn’t get lost. This list will also be on wiki for easy reference and as
an index of topics:
https://meta.wikimedia.org/wiki/Abstract_Wikipedia/Early_discussion_topics
The order of topics will also be worked on together on wiki with you.
Here’s the list:
- Which communication channels we want to use (e.g. list, Wikispore, Meta,
etc.)
- Naming and logo of the project and its components
- Data and evaluation model for the code repository (Wikilambda)
- How to effectively communicate about the project (onboard potential
contributors, have easy to understand documentation and communication
channels, etc)
- Early planning of code development work
- Licenses for the different parts of the project
- Ethical considerations, particularly regarding diversity, representation
and sustainability
- Ethical considerations regarding AI aspects of the project
- How to incorporate external expert knowledge and advice
- How to invoke the code repository from other Wikimedia projects
- Theoretical and practical underpinnings of the natural language generation
- Security and safety models for the code repository
- Early community seeding considerations
- Safety and Code of Conduct within the new community
- Usage of the ontological knowledge in Wikidata
- Usage of the lexicographical knowledge in Wikidata
- How do we structure our engagement with the Wikipedias, and with which
Wikipedias?
- How do we reach out to other Wikimedia communities?
- How and whether do we reach out to non-Wikimedia communities?
- Where should the Abstract Wikipedia content be created and maintained?
As said, please regard this as an early draft, and please feel free to
raise further topics. On wiki we can start ordering these topics, and
then in the upcoming weeks we will start discussing and working through
them.
Also, Adam sent a list of topics, which I have already added to the wiki
page, and he also started a wiki page for ideas. Thanks for the initiative,
Adam - it is great to see this enthusiasm!
Thank you!
Denny
I recently created a wiki page for the group to list, expand upon, brainstorm and discuss ideas for the Wikilambda and Abstract Wikipedia projects. Please feel free to edit the wiki page and to discuss it in the page’s discussion area.
https://meta.wikimedia.org/wiki/Abstract_Wikipedia/Ideas
Best regards,
Adam Sobieski
Abstract Wikipedia Mailing List,
Hello. Congratulations on launching the new and exciting Wikilambda and Abstract Wikipedia projects.
Which Programming Languages?
Which programming languages will be used on Wikilambda for Abstract Wikipedia? Might a new programming language be useful? Might it be useful to transpile a new programming language into an existing one (e.g. as TypeScript transpiles to JavaScript)?
Models of Function Definitions
On the topic of programming language design: while we most often consider functions as comprising signatures and bodies, might it be useful to consider models of functions with more components? We could express such things with multiple, nested “{}” brackets or scopes.
In some models, a function definition might include a guard, preconditions, a body, and effects.
Inspired by PDDL:
function foo(…)
{
    guard         { /* … */ }
    preconditions { /* … */ }
    body          { /* … */ }
    effects       { /* … */ }
}
In other models, a function definition might include requires, a body, and ensures.
Inspired by code contracts:
function foo(…)
{
    requires { /* … */ }
    body     { /* … */ }
    ensures  { /* … */ }
}
Such models can be phrased as transformations or mappings into existing languages; they separate and organize varieties of content which frequently occur in function body definitions.
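For illustration, here is a rough sketch (in Python; not a Wikilambda design, and the function name is made up) of how a requires / body / ensures model could map onto an existing language, with the contract sections becoming runtime checks around the body:

```python
# Sketch: a contracts-style function mapped onto a plain Python
# function, with requires/ensures sections as runtime assertions.

def int_divide(dividend, divisor):
    # requires: preconditions, checked before the body runs
    assert divisor > 0, "requires: divisor must be positive"

    # body: the actual computation
    quotient, remainder = divmod(dividend, divisor)

    # ensures: postconditions, checked before returning
    assert dividend == quotient * divisor + remainder
    assert 0 <= remainder < divisor
    return quotient, remainder

int_divide(7, 3)  # (2, 1)
```

A transpiler for a new language with explicit requires/ensures scopes could emit exactly this kind of checked code in a host language.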
Structured Comments, Metadata and Versioning
Structured content (e.g. XML) in comments is both useful and increasingly popular. Structured comments can provide extensible metadata for functions, ease the searching for functions in large collections, and make possible the automatic generation of documentation.
Structured comments could be processed and utilized by the Wikilambda project to provide features.
Also, some programming languages (e.g. C#, Java) support metadata attributes which can adorn methods and their parameters. A useful attribute is [Obsolete], which supports messages steering users toward a newer version of a function, e.g. [Obsolete(“Consider use of foo_2 instead.”)].
Via structured comments, attributes, or other means, the capability of adorning functions with versioning-related metadata would simplify versioning scenarios on evolving crowdsourced resources.
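As one possible shape for such metadata (sketched in Python with a hypothetical decorator name, not an existing Wikilambda feature), an obsolescence marker could both warn callers and attach machine-readable versioning data to the function:

```python
import functools
import warnings

def obsolete(message):
    """Hypothetical attribute marking a function as superseded,
    analogous to C#'s [Obsolete("...")] attribute."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            # Warn callers at call time that a newer version exists.
            warnings.warn(f"{func.__name__} is obsolete: {message}",
                          DeprecationWarning, stacklevel=2)
            return func(*args, **kwargs)
        # Machine-readable metadata that tooling could index.
        wrapper.obsolete_message = message
        return wrapper
    return decorator

@obsolete("Consider use of foo_2 instead.")
def foo():
    return "old result"
```

Tooling could then search a large function collection for the `obsolete_message` metadata to generate documentation or migration hints.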
Namespaces and Modules
Namespaces and modules can be useful when organizing large collections of functions. With namespaces or modules, multiple paradigms or ecosystems of functions could coexist in a crowdsourced resource for use in natural language generation scenarios.
Global Variables and Runtime Environment API
Global variables, such as the currently desired natural language for output, could be provided by the runtime environment. An API could be specified for the runtime environment.
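One way to avoid true global variables (sketched below in Python; the class and function names are invented for illustration) is to pass an environment object carrying ambient settings, such as the desired output language, into each renderer:

```python
from dataclasses import dataclass

# Hypothetical runtime environment API: renderers receive an
# Environment object instead of reading global variables.

@dataclass(frozen=True)
class Environment:
    output_language: str  # e.g. an IETF language tag such as "de"

def render_greeting(env: Environment) -> str:
    # A toy renderer that consults the environment for the
    # output language, falling back to English.
    greetings = {"en": "Hello", "de": "Hallo", "it": "Ciao"}
    return greetings.get(env.output_language, greetings["en"])

render_greeting(Environment(output_language="de"))  # "Hallo"
```

Making the environment an explicit, immutable parameter also keeps functions pure and testable, which matters for a catalog of shared functions.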
Output Streams, Logging and Diagnostic Events
When editing / developing Wikilambda content for use on Abstract Wikipedia, it would be convenient to be able to output to multiple streams, to log, or to raise typed events. It would also be useful to be able to aggregate, organize and view these diagnostic outputs with a configurable granularity or verbosity.
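As a minimal sketch of configurable diagnostic granularity (using Python's standard logging module; the logger name and renderer are illustrative, not Wikilambda APIs), each renderer could write to its own named channel and viewers could filter by level:

```python
import logging

# Each renderer logs to its own named channel; the viewer chooses
# the verbosity (DEBUG for fine-grained, INFO/WARNING for coarse).
logging.basicConfig(level=logging.INFO)
renderer_log = logging.getLogger("wikilambda.renderer")  # illustrative name

def render(item):
    renderer_log.debug("picking lexeme for %r", item)  # fine-grained detail
    renderer_log.info("rendering %r", item)            # coarse progress
    if item is None:
        renderer_log.warning("missing item; emitting placeholder")
        return "[?]"
    return str(item)
```

Typed events and multiple output streams could be layered on top of the same idea: named channels plus per-channel severity filtering.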
Developer Experiences with the Rendering of Natural Language
One possible user experience for Wikilambda editors / developers would be a
toggleable developer or debugging mode on Abstract Wikipedia in which one
could view rendered articles and hover over natural language output to view
relevant diagnostic messages in hoverboxes.
Another possibility would be a mode in which visual indicators for
diagnostic messages are placed in a margin to the side of the rendered
natural language output.
Best regards,
Adam Sobieski
Hi,
I want to auto-generate disambiguation descriptions for African
politicians to be added to Wikidata, e.g. from the country Mozambique
(Q1029) the following descriptions should be generated:
Mozambican politician (en)
Mosambikanischer Politiker (de)
politico mozambicano (it)
...
This could be extended to other professions. My questions:
- Can anyone point me to data sources where best to look up country
adjectives such as "Mozambican"?
- Where/how to best store the lexical information for best reuse with
other renderers?
- If I create small renderers for these short descriptions, what
architecture do you prefer for best reuse?
My just-get-it-done solution would be a set of CSV files and a few lines
of Perl code, but maybe this use case can be aligned with Abstract
Wikipedia to learn more about it.
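For illustration, the just-get-it-done approach might look like the following sketch (in Python rather than Perl; the CSV layout and the pre-inflected German adjective form are invented for this example):

```python
import csv
import io

# Hypothetical CSV of per-language country adjectives. The German
# form is stored pre-inflected; real renderers would need proper
# agreement rules per language.
ADJECTIVES_CSV = """\
country,en,de,it
Q1029,Mozambican,Mosambikanischer,mozambicano
"""

# Word order differs per language: Italian puts the adjective
# after the noun.
PATTERN = {"en": "{adj} {prof}", "de": "{adj} {prof}", "it": "{prof} {adj}"}
PROFESSION = {"en": "politician", "de": "Politiker", "it": "politico"}

def load_adjectives(text):
    return {row["country"]: row for row in csv.DictReader(io.StringIO(text))}

def describe(country_qid, lang, adjectives):
    return PATTERN[lang].format(adj=adjectives[country_qid][lang],
                                prof=PROFESSION[lang])

adjectives = load_adjectives(ADJECTIVES_CSV)
describe("Q1029", "en", adjectives)  # "Mozambican politician"
```

Even this tiny sketch already surfaces the two hard parts of the use case: inflection (German adjective endings) and word order (Italian noun-adjective), which is exactly what shared renderers would have to abstract over.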
Looking forward to collaborating,
Jakob
Hi all,
I just wanted to say a quick welcome to everyone here!
This is the mailing list to discuss the Abstract Wikipedia proposal and its
development and growth. The proposal is here:
https://meta.wikimedia.org/wiki/Abstract_Wikipedia
There is a lot to discuss and plan, and I am very excited about this
becoming real. I am looking forward to reading from all of you! Next week
I will start to share more myself: thoughts, ideas, and how we could start
this off. But feel free to start already!
Stay safe,
Denny
Hello Abstractipedia! There is so much we can do together.
I think often of how commercial encyclopedias have shifted to making
materials for children; and how essential it is to support the next
generation of knowledge sharers and take seriously engaging with their
discovery process.
Having renderers for simplified (lang), and resolving the age-old challenge
of how to think of 'simplified' as a toggle or slider for all languages,
will be particularly satisfying.
SJ
P.S. I hope we still get some use out of the other excellent + generative
name ;)
https://meta.m.wikimedia.org/wiki/Wikilambda
🌍🌏🌎🌑
Hello all,
Again, welcome!
Here’s my first email to this list with my new account and in my new role.
Welcome, everyone who signed up here - I am so thankful that you are here.
There is a lot we have to do, and a few great discussion threads have
already started; I will come join them soon.
We are embarking on an ambitious journey, and we don’t know where we will
end up, or how we will get there. Our goal is clear: we want to help more
people to be able to share in more knowledge in more languages. This is a
two-fold goal: in the end we want to be in a place, (1) where more people
can read more articles in languages and contexts with which they are
comfortable, and also (2) where more people will feel comfortable to
contribute to the sum of all knowledge and make their voices heard as part
of the global conversation. We want to build a meaningful contribution
towards a world of more knowledge equity.
This will take time. Years, in fact. And we will take at least one major,
but necessary, detour on the way to that goal: the creation of a new
Wikimedia project that will allow the collaborative creation of a catalog
of functions, a form of knowledge asset that has not been a primary concern
of the other Wikimedia projects so far.
We expect that our first milestone will be this new catalog, which in the
proposal was called “Wikilambda”. So, in fact, we will not focus our
development work much on Wikipedias, natural language generation, or
articles during that first part of our journey. And that is quite
intentional: as you can see in some of the other threads, there is a huge
amount of existing work that we need to make accessible for the project
and figure out how to build on top of. We don't want to reinvent wheels
if we can avoid it.
Right now, we don’t know yet how exactly we will implement the natural
language generation, and how the abstract content will be stored and
created and maintained. We are confident it can be done. But there are many
open questions around that part, and we will need some time and partnership
with experts inside and outside of the Wikimedia communities to work out
which path we will explore. And I can see in the discussions on this list
already some great suggestions and points - thanks for that! But the goal
is to create, with the library of functions, a foundation that is strong
and robust enough to try out different paths, and even if we get it wrong
one or two times, we will be able to pivot and explore other paths.
To sketch the timeline roughly: the first part of the project will be
focused on launching the function wiki and integrating it with the other
Wikimedia projects. Only once that is in place will we start the
development work on the natural language generation part, while
continuing to improve the functional foundation. This first part is
expected to take at least a year.
This does not mean that we don't want work to happen on the natural
language generation part. As is being discussed, it would be great to
create an accessible overview of the current state of the art, and to make
sure that lessons learned won't be forgotten when we start on this. It is
time to survey the landscape of previous work so that we can then make, as
a community, confident decisions about which paths to explore. This is a part
where I am hoping for considerable input from you.
Also, we are starting with a small exploratory core team, much smaller than
it was for Wikidata: me, Denny, as project lead; Adam Baso as Director;
James Forrester as engineer; and Nick Wilson as documentation specialist.
This core team is relying on the support from many others in the
Foundation, from other movement bodies, and from the communities. We plan
to look for more funding in order to support the work, particularly as
things become increasingly concrete during the project development.
We are creating a first draft of a list of topics that we will have to
discuss in the first few months, and we’ll send it out tomorrow. This list
won’t be complete, and please make yourself heard so that we don’t miss
important topics. Obviously, we can add topics at any time later too, but
we want to try to set expectations and share insight in how we are planning
to work on these topics.
We think of this as a collaborative project, and we will rely on the
movement's expertise. We want to make sure that we listen and work together
on this. We know this is crucial for the success of the project. We are
committed to having our communication, discussions, and work be in the open,
in order to get early feedback and early engagement. But this also means
that not every single one of our messages will be polished, that we will
surface disagreements within the team, and that we will make mistakes - and
that we want to be able to correct them and, through interacting with you,
refine our work and our plans, and improve.
Please feel free to ask questions and comment on this rough outline for how
the project will work. Thank you!
I am super excited - thank you all for being here, and I am looking forward
to the next few years!
Stay safe,
Denny
I would like to move this discussion to a new thread for those interested in knowledge representation for natural language generation.
Something which interested me about UNL, when it was recently brought to my attention, was that it expands upon predicate calculus by providing expressiveness for placing attributes upon objects and relations. I presumed that one could also place them upon expressions (@a4, @a8, @a13) and upon sets of expressions as well (@a14).
UNL’
{
    r1.@a1(o1(icl>domain1).@a2, o2(icl>domain2).@a3).@a4
    r2.@a5(o3(icl>domain3).@a6, o4(icl>domain4).@a7).@a8
    r3.@a9(o5(icl>domain5).@a10, o6(icl>domain6).@a11, o7(icl>domain7).@a12).@a13
}.@a14
I then considered that, beyond attributes, one could place attribute-value pairs upon objects, relations, expressions and sets of expressions.
Something New
{
    r1.[@a1=v1](o1.[@a2=v2], o2.[@a3=v3]).[@a4=v4]
    r2.[@a5=v5](o3.[@a6=v6], o4.[@a7=v7]).[@a8=v8]
    r3.[@a9=v9](o5.[@a10=v10], o6.[@a11=v11], o7.[@a12=v12]).[@a13=v13]
}.[@a14=v14]
Resembling W3C technologies, URIs could be used for objects, relations and attributes instead of plain-text strings.
Beyond attribute-value pairs, it is also possible that objects, relations, expressions, and sets of expressions could each be objects in the sense of object graphs (or potentially semantic graphs).
In the Wikipedia article about UNL, there is the example: “The sky is blue?!”. The Wikipedia article indicates the tabular representation in UNL for that utterance to be:
aoj( blue(icl>color).@entry.@past.@interrogative.@exclamation , sky(icl>natural world).@def)
In UNL’, a tabular representation for that utterance might be:
aoj.@past( blue(icl>color).@entry , sky(icl>natural world).@def).@interrogative.@exclamation
In the new knowledge representation format, we could, resembling the mapping of valueless attributes from HTML to XHTML, map the valueless attributes to Boolean-valued attributes and assign them the value of true:
aoj.[@past=true]( blue(icl>color).[@entry=true] , sky(icl>natural world).[@def=true]).[@interrogative=true].[@exclamation=true]
We could also map semantics to other attribute-value pairs, for example:
aoj.[@tense=past](…)
I am finding these topics to be interesting. I am also presently considering an object model for the new knowledge representation format, resembling how the DOM assists developers working with XML.
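A minimal sketch of such an object model (in Python; the class names are hypothetical, not part of UNL or any existing format) might represent relations over objects, with attribute-value pairs on every node:

```python
from dataclasses import dataclass, field

# Sketch of an object model for the proposed format: relations over
# objects, where objects and relations each carry attribute-value pairs.

@dataclass
class Node:
    attributes: dict = field(default_factory=dict)

@dataclass
class Obj(Node):
    name: str = ""
    restriction: str = ""  # e.g. "icl>color"

@dataclass
class Relation(Node):
    name: str = ""
    arguments: list = field(default_factory=list)

# "The sky is blue?!" in the attribute-value form above; for brevity,
# the expression-level attributes are kept on the top-level relation.
blue = Obj(name="blue", restriction="icl>color", attributes={"entry": True})
sky = Obj(name="sky", restriction="icl>natural world", attributes={"def": True})
utterance = Relation(
    name="aoj",
    arguments=[blue, sky],
    attributes={"tense": "past", "interrogative": True, "exclamation": True},
)
```

A DOM-like traversal and query layer could then be defined over these nodes, which is where such an object model would start to convenience developers.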
Any thoughts on these ideas or about knowledge representation for natural language generation in general?
Best regards,
Adam Sobieski