Cory,
All,
It is good to see that others are also interested in and thinking about these topics.
It is also good to see that debugging and fine-tuning experiences for developers, editors,
and end-users are already being considered: experiences in which one can inspect and
select portions of system output and trace them back into source code and data to make
improvements.
It appears that controlling subjectivities and ensuring objectivity are relevant
throughout the stages of natural-language generation: content determination, document
structuring, aggregation, lexical choice, referring expression generation, and
realization. I had been thinking about subjectivities and style with respect to lexical
choice, referring expression generation, and realization (choosing the specific nouns,
verbs, adjectives, and adverbs), and I agree with your point about content determination:
encyclopedia-article-generating systems could enhance objectivity by choosing to present
content from multiple perspectives.
Yes, I would be interested in more information about your cross-linguistic literature
review of Wikis.
Thank you for sharing that publication about neutrosophic logic, in which every logical
variable x is "described by a triple: x = (t, i, f), where t is the
degree of truth, f is the degree of false, and i is the level of indeterminacy."
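
For illustration only, the triple quoted above could be held in a small data structure.
This is a minimal sketch under a simplifying assumption that is mine, not the paper's:
each component is a single number in [0, 1]. The class name NeutrosophicValue and the
example values are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NeutrosophicValue:
    """Simplified (t, i, f) triple; each component is assumed to lie in [0, 1]."""
    t: float  # degree of truth
    i: float  # level of indeterminacy
    f: float  # degree of falsity

    def __post_init__(self):
        # Validate each component independently.
        for name in ("t", "i", "f"):
            value = getattr(self, name)
            if not 0.0 <= value <= 1.0:
                raise ValueError(f"{name} must be in [0, 1], got {value}")


# Unlike a probability distribution, the components are not required to sum to 1.
contested = NeutrosophicValue(t=0.6, i=0.3, f=0.4)
```

The point of the sketch is only that indeterminacy is stored as its own component rather
than being derived from the other two.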
Best regards,
Adam
________________________________
From: Cory Massaro <cmassaro(a)wikimedia.org>
Sent: Thursday, November 17, 2022 2:49 PM
To: General public mailing list for the discussion of Abstract Wikipedia and Wikifunctions
<abstract-wikipedia(a)lists.wikimedia.org>
Subject: [Abstract-wikipedia] Re: Generating Natural-language Stories using Wikidata
Historical Data
Hi Adam!
This is a really interesting set of questions; for me, these are the most important
discussions to have around Abstract Wikipedia. I'll clarify whether I'm answering
from the perspective of what the team's currently doing or my own personal views and
desires.
First, there's a natural tension between the AW project and the notion of neutral POV.
Abstract Wikipedia is an intrinsically pluralistic project, amplifying a given linguistic
community's ability to represent their point of view. Neutral POV is intrinsically
hegemonic, as "neutrality" is decided (even if implicitly) by the relative
loudness of voices, and so usually comes down to what's said in a small number of
languages with many, or highly privileged, speakers.
> Without a means of controlling and varying the subjectivities of output stories and
> language, shouldn't one desire for the output to be as measurably objective as
> possible?
WMF perspective: presumably, all information in any Wiki project is conveyed from a
neutral point of view. This policy is what's usually invoked to combat misinformation,
say when a corporate entity or political group brigades a given Wiki and makes highly
biased edits.
> I am thinking about whether resultant machine-generated stories would be objectively or
> subjectively narrated. These topics appear to pertain to the philosophy of history [1] and
> neutrality [2], resembling encyclopedists' ideals of neutrality with respect to point
> of view [3].
Personal note: I did a little cross-linguistic literature review of Wikis a while ago,
which I'd be happy to share with you. There are real differences of perspective in
many domains. Some obvious differences concerned contested territories, actively occupied
territories, etc. Others had to do with historical events and political philosophies; some
highlights would be the English Wikipedia article on "Communism" and its Spanish
equivalent, or a comparison between the Arabic Wikipedia's article on the crusades and
that in English or French.
What would be really interesting here (speaking for myself) would be to augment these
articles. Arguably, the most "objective" thing one can do would be to represent
diversity (while trying to avoid misinformation ... somehow ;) ). So what would it look
like if Wikipedia articles told history from multiple perspectives, and Abstract Wikipedia
allowed us to gather those multiple perspectives and add them to the articles in multiple
Wikis?
> What do you think about the idea that natural-language story generating systems could use
> parameters or additional inputs to vary the subjectivities of the output?
Continuing from the above, and still speaking for myself: it would be interesting to adopt
primitives from epistemic logic in Wikidata. So Wikidata could represent not just a fact,
but the perspective from which that fact is true. This would require a real organized
effort (or a really good polyglot data-mining ML), but it's a way that Abstract
Wikipedia could maintain "neutrality" while also admitting some level of
subjectivity. And again, there are a ton of dangers here, like legitimizing misinformed
(or downright malicious) perspectives. Maybe we then need a "truth score," like
in neutrosophic logic <http://fs.unm.edu/IntrodNeutLogic.pdf>?
I don't know. Concretely, I think that something in this area would be a great
vertical to focus on as a pilot for Abstract Wikipedia!
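
As a rough illustration of the idea above (a fact stored together with the perspective
from which it is held to be true), here is a minimal sketch. It is not how Wikidata or
Abstract Wikipedia models this; the Claim class, its perspective field, and the
placeholder data are all hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Claim:
    subject: str
    predicate: str
    value: str
    perspective: str  # hypothetical field: who holds this claim to be true


def group_by_perspective(claims):
    """Return {perspective: [claims]}, preserving the input order within each group."""
    grouped = defaultdict(list)
    for claim in claims:
        grouped[claim.perspective].append(claim)
    return dict(grouped)


# Placeholder data: the same event described differently by two sources.
claims = [
    Claim("Event X", "described as", "liberation", "Source A"),
    Claim("Event X", "described as", "occupation", "Source B"),
]
by_view = group_by_perspective(claims)
```

A generator could then render one paragraph per perspective, or attribute each sentence
to its source, instead of silently choosing one description.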
> What do you think about providing the capability for developers to be able to trace
> backwards from natural-language outputs (from words, phrases, sentences, and paragraphs)
> into source code and data? Developers would, then, be able to more readily version
> software and data utilizing metrics and evaluation tools, e.g., Grammarly or sentiment
> analysis. In theory, systems could provide accompanying “debugging data” alongside
> natural-language outputs, this data including mappings from selections of natural
> language, wikitext, or hypertext to stack traces or other data structures.
This is part of why Abstract Wikipedia avoids machine learning. As presently conceived,
every step of the NLG process can be inspected. We don't yet package this
"debugging data" up nicely; however, it would absolutely be possible to do so,
and might have applications to prevent abuse--presumably, if somebody misuses Abstract
Wikipedia to produce fake news, this metadata could provide signal to automate detection
and remediation. I'm not an expert on that, though; the disinformation folks at the
WMF would know more.
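
One way such "debugging data" might be packaged, sketched under assumptions of my own:
each piece of output text carries a record of the data item and the function that
produced it, and a character position in the final text can be mapped back to that
record. The names Fragment, realize, trace, and the item and function identifiers are
hypothetical and do not reflect the actual Abstract Wikipedia pipeline.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Fragment:
    text: str
    source_item: str  # hypothetical ID of the data item used
    producer: str     # hypothetical name of the function that produced the text


def realize(fragments):
    """Join fragments with spaces; return the text and a list of (start, end, fragment)."""
    parts, spans, offset = [], [], 0
    for fragment in fragments:
        start = offset
        end = start + len(fragment.text)
        spans.append((start, end, fragment))
        parts.append(fragment.text)
        offset = end + 1  # skip the joining space
    return " ".join(parts), spans


def trace(spans, position):
    """Return the fragment covering a character position, or None for a joining space."""
    for start, end, fragment in spans:
        if start <= position < end:
            return fragment
    return None


fragments = [
    Fragment("The treaty", "item-1", "render_subject"),
    Fragment("was signed", "item-2", "render_verb"),
    Fragment("in 1648.", "item-3", "render_date"),
]
text, spans = realize(fragments)
origin = trace(spans, text.index("signed"))
```

Selecting the word "signed" in the output would lead back to the render_verb step and
item-2, which is the kind of trace from output to source code and data asked about above.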
Happy to chat further!
Best,
Cory
On Thu, Nov 17, 2022 at 6:04 AM Adam Sobieski <adamsobieski@hotmail.com> wrote:
Abstract Wikipedia,
Hello. I have recently been thinking about the generation of natural-language stories from
Wikidata data, e.g., graphs of interrelated real-world historical events.
I am thinking about whether resultant machine-generated stories would be objectively or
subjectively narrated. These topics appear to pertain to the philosophy of history [1] and
neutrality [2], resembling encyclopedists' ideals of neutrality with respect to point
of view [3].
In my opinion, there would be much to learn from developing natural-language story
generating systems which could have parameters set or which could receive secondary input
data to subsequently produce subjective stories. With such systems, developers could
control and vary the subjectivities of resultant natural-language output, e.g., as
pertaining to sentiment.
What do you think about the idea that natural-language story generating systems could use
parameters or additional inputs to vary the subjectivities of the output?
Without a means of controlling and varying the subjectivities of output stories and
language, shouldn't one desire for the output to be as measurably objective as
possible?
What do you think about providing the capability for developers to be able to trace
backwards from natural-language outputs (from words, phrases, sentences, and paragraphs)
into source code and data? Developers would, then, be able to more readily version
software and data utilizing metrics and evaluation tools, e.g., Grammarly or sentiment
analysis. In theory, systems could provide accompanying “debugging data” alongside
natural-language outputs, this data including mappings from selections of natural
language, wikitext, or hypertext to stack traces or other data structures.
Best regards,
Adam Sobieski
[1] https://en.wikipedia.org/wiki/Philosophy_of_history
[2] https://en.wikipedia.org/wiki/Philosophy_of_history#Philosophy_of_neutrality
[3] https://en.wikipedia.org/wiki/Wikipedia:Neutral_point_of_view