From the department of the self-serving, I will note that the focus of their computational NLU microtheories is pragmatics.

They recognize that humans shortcut their language use: rather than spelling out every detail, we switch among meanings as context demands. From the book:

“Regarding the latter, in his analysis of the place of formal semantics in NLP, Wilks (2011) reports a thought-provoking finding about a sentence type that has been discussed extensively in the theoretical literature, illustrated by the well-known example John wants to marry a Norwegian. Such sentences have been claimed to have two interpretations: John wants to marry a particular Norwegian (de re), and he wants to marry some Norwegian or other (de dicto). When Wilks carried out an informal web search for the corresponding “wants to marry a German” (since marrying a Norwegian was referenced in too many linguistics papers), the first twenty hits all had the generic meaning, which suggests that if one wants to express the specific meaning, this turn of phrase is just not used. Wilks argues that computational semantics must involve both meaning representation and “concrete computational tasks on a large scale” (p. 7). He writes, “What is not real Compsem [computational semantics], even though it continues to masquerade under the name, is a formal semantics based on artificial examples and never, ever, on real computational and implemented processes” (p. 7).”

The “marry a German” research illustrates the need to reflect real-world usage, and also that we paraphrase freely until the context demands a meaning that a paraphrase just can’t convey. Words don’t matter; meaning does - see emojis. I submit that a lexeme should define a concept meaning, not a word meaning. We use words, but we communicate concepts. The NLU computational construct must reflect the real use of language.

Further, for a knowledge representation, words and N-grams are not knowledge carriers; concepts in context are the real-world knowledge carriers. We should probably model our NLU solution the same way; a toy illustration follows.
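To make that concrete, here is a toy sketch (mine, not the book’s - every name in it is hypothetical) of a knowledge item keyed on a concept in context rather than on a word or N-gram, in Python:

from dataclasses import dataclass

@dataclass(frozen=True)
class ConceptInContext:
    # Hypothetical sketch: the key is an ontology concept plus the context
    # features that disambiguate it, never a surface word or N-gram.
    concept: str    # e.g. "MARRY" as a concept label, not the word "marry"
    context: tuple  # e.g. ("agent=JOHN", "patient=GERMAN-PERSON", "reading=generic")

# Knowledge attaches to the (concept, context) pair, so paraphrases that
# express the same concept in the same context share a single entry.
knowledge = {
    ConceptInContext("MARRY", ("agent=JOHN", "patient=GERMAN-PERSON", "reading=generic")):
        "John wants to marry some German or other (the de dicto reading).",
}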

Tagged words and word-proximity computation approximate how we choose meaning, but they fail at generating language because of that approximating nature and the lack of real context. We can, however, use that computational approach to build a mirror of human communication (a pragmatic solution) by vectorizing real-world use and de-duplicating it (paraphrase grouping). The result is still computational. Over time, we maintain the meaning vectors and paraphrase groups so they better reflect real-world human use. We de-emphasize words and concentrate on concepts in contexts, just as human communicators do. A rough sketch of the vectorize-and-group step follows.
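Here is a minimal sketch (again mine, not from the book) of vectorizing utterances and grouping paraphrases. It assumes the sentence-transformers library; the model name, the 0.8 cosine threshold, and the greedy centroid clustering are illustrative choices, not a prescribed algorithm.

import numpy as np
from sentence_transformers import SentenceTransformer  # pip install sentence-transformers

def paraphrase_groups(utterances, threshold=0.8):
    """Greedily group utterances whose embeddings are close enough to count as paraphrases."""
    model = SentenceTransformer("all-MiniLM-L6-v2")  # illustrative model choice
    # Normalized embeddings make cosine similarity a plain dot product.
    vecs = model.encode(utterances, normalize_embeddings=True)
    groups = []  # each entry: [meaning vector (centroid), list of member utterances]
    for utterance, vec in zip(utterances, vecs):
        best, best_sim = None, threshold
        for group in groups:
            sim = float(np.dot(group[0], vec))
            if sim >= best_sim:
                best, best_sim = group, sim
        if best is None:
            groups.append([vec, [utterance]])  # new concept-in-context
        else:
            best[1].append(utterance)          # de-dupe: same paraphrase group
            # Maintain the "meaning vector" as the renormalized running mean.
            centroid = best[0] + (vec - best[0]) / len(best[1])
            best[0] = centroid / np.linalg.norm(centroid)
    return [members for _, members in groups]

if __name__ == "__main__":
    sample = [
        "John wants to marry a German.",
        "John hopes to wed a German.",
        "The meeting starts at noon.",
    ]
    for members in paraphrase_groups(sample):
        print(members)

Maintaining those centroids over time is the “meaning vector” upkeep described above; a real system would also attach context features before grouping.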

Lastly, a pragmatic computational approach will have a natural language interface - even non-developers can read it and interact with it with zero training - just concepts and contexts with their associated knowledge items.

Thank you, Amirouche - a great read for NLP nerds.

On Sat, Oct 23, 2021 at 11:53 AM Amirouche BOUBEKKI <amirouche@hyper.dev> wrote:
It is not my book, but I think it will interest people around here. I think it is an easy read, meant to be read by professionals and hobbyists alike. There is no code.

A human-inspired, linguistically sophisticated model of language understanding for intelligent agent systems.

The open access edition of this book was made possible by generous funding from Arcadia – a charitable fund of Lisbet Rausing and Peter Baldwin.

One of the original goals of artificial intelligence research was to endow intelligent agents with human-level natural language capabilities. Recent AI research, however, has focused on applying statistical and machine learning approaches to big data rather than attempting to model what people do and how they do it. In this book, Marjorie McShane and Sergei Nirenburg return to the original goal of recreating human-level intelligence in a machine. They present a human-inspired, linguistically sophisticated model of language understanding for intelligent agent systems that emphasizes meaning—the deep, context-sensitive meaning that a person derives from spoken or written language.

With Linguistics for the Age of AI, McShane and Nirenburg offer a roadmap for creating language-endowed intelligent agents (LEIAs) that can understand, explain, and learn. They describe the language-understanding capabilities of LEIAs from the perspectives of cognitive modeling and system building, emphasizing “actionability”—which involves achieving interpretations that are sufficiently deep, precise, and confident to support reasoning about action. After detailing their microtheories for topics such as semantic analysis, basic coreference, and situational reasoning, McShane and Nirenburg turn to agent applications developed using those microtheories and evaluations of a LEIA's language understanding capabilities.

McShane and Nirenburg argue that the only way to achieve human-level language understanding by machines is to place linguistics front and center, using statistics and big data as contributing resources. They lay out a long-term research program that addresses linguistics and real-world reasoning together, within a comprehensive cognitive architecture.

https://direct.mit.edu/books/book/5042/Linguistics-for-the-Age-of-AI