Thank you.
On Friday, March 17, 2023, 11:27 AM, Thomas Douillard <thomas.douillard(a)gmail.com>
wrote:
About the "discussion" tab on Wikidata items: it now includes the "item documentation"
template through automated headers, so whether the talk page exists might not be a good
proxy for whether the discussion tab is actually used. It is hard to know whether the
community uses it a lot, though.
Elaborating on that, the fact that it is hidden on the talk page might mean it is used less
often than it could be. That might be a lesson for abstract content: how its existence is
discovered and how visible it is might be a big factor in how the project gains traction
and builds a community. That is at risk if the content is not easily discovered and seems
buried in some part that looks like the technical plumbing of the Wikimedia movement. In
the case of Wikidata, in the beginning, some unenthusiastic contributors would refer to
it, not in a good way, as a "database". Abstract Wikipedia deserves a shiny portal and
good discoverability of its content, which might even be a separate discussion.
On Thu, 16 Mar 2023 at 03:44, Denny Vrandečić <dvrandecic(a)wikimedia.org> wrote:
The online version of this newsletter can be found here:
https://meta.wikimedia.org/wiki/Abstract_Wikipedia/Updates/2023-03-15
Quo vadis, Abstract Wikipedia?
Abstract Wikipedia will allow more people to contribute their voices to a baseline of
knowledge, whilst working in an interface in their language. This shared baseline of
knowledge will then be made available in many languages. This will be achieved by
creating, storing, and maintaining the baseline of knowledge, per individual article, in a
notation that is independent of natural language. We call content written in that notation
"abstract content". This abstract content will then be turned into text in a
specific natural language, with the help of the library of functions from Wikifunctions.
Thus, Abstract Wikipedia will allow a speaker of any language to contribute content for
readers of many different languages. This will allow more people to read more content in
their language.
Example
Assume that we want to create a new Wikipedia article for the planet Jupiter, and the
first version of this article shall be the following (these are the first two sentences of
the Simple English Wikipedia article):
- Jupiter is the largest planet in the Solar System. It is the fifth planet from the
Sun.
The abstract content representing this natural text could look like this:
Article(
  text: [
    Superlative(
      subject: Jupiter,
      quality: large,
      class: planet,
      location constraint: Solar System),
    Definition(
      subject: Jupiter,
      definition: Rank(
        rank: Positive integer(
          value: 5),
        object: planet,
        by: Relational noun(
          noun: distance,
          to: Sun)))],
  categories: [Jupiter, planet, Solar System])
Wikifunctions would have types for Article, Superlative, Definition, Rank, etc., which are
used here as the abstract notation for the content of these two sentences. This abstract
content is shown using labels in English here for our convenience, whereas in fact they
would all be ZIDs (from Wikifunctions) and QIDs (from Wikidata). There will be software
components to provide for the viewing, creation, and editing of abstract content.
Wikifunctions will then also provide functions that take this object as an argument and
generate natural language text such as the above.
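
To make the idea of "functions that take this object as an argument and generate natural
language text" more concrete, here is a minimal Python sketch. Everything in it is an
assumption made for illustration: the class names only mirror the English labels used
above, the tiny lexicon and the pronoun rule are invented, and none of it reflects the
actual Wikifunctions types or renderers, which would use ZIDs and QIDs.

```python
from dataclasses import dataclass


# Hypothetical stand-ins for the types named in the example above.
@dataclass
class Superlative:
    subject: str
    quality: str
    cls: str        # "class" is a reserved word in Python
    location: str


@dataclass
class Rank:
    rank: int
    obj: str
    by_noun: str
    to: str


@dataclass
class Definition:
    subject: str
    definition: Rank


# Toy English lexicon (assumption): just enough for the Jupiter example.
SUPERLATIVE_FORMS = {"large": "largest"}
ORDINALS = {5: "fifth"}


def render_sentence_en(content, subject_text):
    """Render one piece of abstract content as an English sentence."""
    if isinstance(content, Superlative):
        form = SUPERLATIVE_FORMS[content.quality]
        return f"{subject_text} is the {form} {content.cls} in the {content.location}."
    if isinstance(content, Definition) and isinstance(content.definition, Rank):
        r = content.definition
        if r.by_noun != "distance":
            raise ValueError(f"no English pattern for ranking by {r.by_noun!r}")
        # In this toy grammar, "ranked by distance to X" becomes "from the X".
        return f"{subject_text} is the {ORDINALS[r.rank]} {r.obj} from the {r.to}."
    raise TypeError(f"unsupported content: {content!r}")


def render_article_en(text):
    """Join sentences, using "It" when the subject repeats the previous one."""
    sentences = []
    previous_subject = None
    for content in text:
        subject_text = "It" if content.subject == previous_subject else content.subject
        sentences.append(render_sentence_en(content, subject_text))
        previous_subject = content.subject
    return " ".join(sentences)


jupiter_text = [
    Superlative("Jupiter", "large", "planet", "Solar System"),
    Definition("Jupiter", Rank(5, "planet", "distance", "Sun")),
]
print(render_article_en(jupiter_text))
```

A renderer for another language would consume the same jupiter_text object and differ
only in its lexicon and sentence patterns.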
One question that needs to be answered is where these objects would be stored, and how to
associate the above object with the Wikidata item for Jupiter, Q319. We were originally
planning to have this conversation and decision before the launch of Wikifunctions, but
looking at the complexity of the system and the fact that it is very difficult to imagine
given that so little of it is tangible so far, we decided not to open this question for
discussion now, but to wait until after the launch of Wikifunctions, when we will all have
a better understanding of how that part of the ecosystem works.
Below, we outline a few options that came up in the discussion between folks on the
Abstract Wikipedia team and the Wikidata team at Wikimedia Deutschland. It also ties to
some of the questions Lydia Pintscher and I were answering in an interview on the Wikimove
podcast episode that was released today and that we invite you to listen to. Thanks to
Nicole Ebber and Nikki Zeuner for the interview!
We are genuinely undecided about the best answer, and we would benefit from a wider
discussion of the options, and potentially other options as well. Please also ask
questions - these can often clarify and shine light on points that are muddy to us as
well. We currently are focusing on the following five options:
- A new tab on items in Wikidata
- Create a new data type for objects on Wikidata
- Objects on Wikifunctions
- Objects on a new Wikipedia language edition
- Unattached namespace on an existing project
Let’s discuss these five options in the following.
Option 1: A new tab on items in Wikidata
We could add a tab, leading to a new namespace on Wikidata with a new content type, where
the abstract content would live. This namespace would be attached to the item namespace in
Wikidata. This way, every item would have a natural place to store its associated
abstract content.
Given the little use of the item talk pages on Wikidata, it seems an additional talk page
may not be valuable, so we might want to redirect people to the item's main talk page
for any discussions.
One big question would be where to store content for Wikivoyage and other projects about
the same items (e.g. the abstract content we might want to write about Q90/Paris from the
perspective of Wikipedia would be different from that for Wikivoyage). Would that need yet
another namespace? Adding a new associated namespace would be a technical challenge;
adding several would be a complex task, which we hope would not have to happen often.
Option 2: Create a new data type for Wikifunctions objects
We could create a new data type on Wikidata for Wikifunctions objects. Then the community
could create a property on Wikidata that stores the abstract content on a given item as a
literal. This would have the added flexibility that the community could add more abstract
contents to an item for specific use cases, e.g. to represent content from Wikivoyage, or
to represent the history of an item, the etymology of a word, etc.
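
As a rough illustration of this option, the following Python sketch models an item whose
statements hold abstract content as literal values. The property identifiers
P_ABSTRACT_WIKIPEDIA and P_ABSTRACT_WIKIVOYAGE are placeholders invented here, not real
Wikidata properties, and the Item class is a simplification rather than the Wikibase
data model.

```python
import json
from dataclasses import dataclass, field


@dataclass
class Item:
    qid: str
    # property ID -> list of values; here each value is an abstract-content literal
    statements: dict = field(default_factory=dict)

    def add_statement(self, prop, value):
        self.statements.setdefault(prop, []).append(value)


# Placeholder property IDs (assumption): the community would create the real ones.
ABSTRACT_WIKIPEDIA = "P_ABSTRACT_WIKIPEDIA"
ABSTRACT_WIKIVOYAGE = "P_ABSTRACT_WIKIVOYAGE"

paris = Item("Q90")
paris.add_statement(ABSTRACT_WIKIPEDIA, {"type": "Article", "note": "encyclopedic view"})
paris.add_statement(ABSTRACT_WIKIVOYAGE, {"type": "Article", "note": "travel view"})


def serialized_size(item):
    """Characters needed to serialize all statements, literals included."""
    return len(json.dumps(item.statements))


print(sorted(paris.statements))
print(serialized_size(paris))
```

Because each literal is stored inside the item itself, every additional abstract content
adds directly to the item's serialized size.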
We will need such a datatype and the UX for it anyway, given our planned support
for abstract descriptions. In fact, this might be the simplest way to support abstract
descriptions.
However, the UX of Wikidata doesn't lend itself to this easily, and adapting to this
model would be challenging given the constraints of an item page. Existing properties are
edited as one or a small number of simple, short text boxes, often with auto-complete;
abstract content would instead be a larger text area, with helpers and possibly a toolbar,
a preview control, etc. One option could in principle be a modal dialog for editing, but
these come with their own inherent UX downsides and are usually more complex to implement
than the same functionality in its own dedicated environment. Also, this would break the
current design patterns of an item page, and may not be aligned with the patterns that
might be planned for its future.
Further, while abstract content is somewhat structured and machine-readable, it is less
(or differently) so than an Item, and its structure would probably not be queryable with
SPARQL.
This option comes with two additional challenges. First, some Wikidata items that are
already nearing the maximum size would grow still further, and we would need a solution to
allow that. Second, we would need to deal with the visualization, editing, and diffing of
potentially very large and complex values.
Option 3: Objects on Wikifunctions
Instead of having objects live in Wikidata as the values of statements or on an additional
tab, Wikidata could merely store a pointer to an object on Wikifunctions. We still would
create a new data type, but that data type would be just the ZID of an object stored in
Wikifunctions.
This would solve all the challenges of the previous option, and retain many of its
advantages.
It would have the additional advantage that several items could refer to the same object
on Wikifunctions. Whereas this sounds rarely useful for the abstract content of Wikipedia
articles, this might prove very useful for abstract descriptions of Wikidata items and the
abstract glosses of lexemes. With the creation of abstract content, types, and natural
language generation functions on the same platform, collaboration between people who focus
on one of these areas would be more direct.
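
The pointer idea, including several items sharing one object, can be sketched in Python
as follows. The ZIDs Z90001 and Z90002 and the keys ITEM_A and ITEM_B are made up for
this sketch, and the two dictionaries merely stand in for Wikifunctions and Wikidata
storage; this is not how either project exposes its data.

```python
# Stand-in for objects stored on Wikifunctions, keyed by (hypothetical) ZID.
WIKIFUNCTIONS_OBJECTS = {
    "Z90001": {"type": "Article", "about": "Jupiter"},
    "Z90002": {"type": "Description", "note": "placeholder shared description"},
}

# Stand-in for Wikidata statements whose value is only a ZID, not the content.
ABSTRACT_POINTERS = {
    "Q319": "Z90001",    # Jupiter points at its abstract article
    "ITEM_A": "Z90002",  # two placeholder items share one abstract description
    "ITEM_B": "Z90002",
}


def resolve_abstract_content(item_id):
    """Follow the pointer from an item to the object it refers to, if any."""
    zid = ABSTRACT_POINTERS.get(item_id)
    if zid is None:
        return None
    return WIKIFUNCTIONS_OBJECTS[zid]


print(resolve_abstract_content("Q319"))
print(resolve_abstract_content("ITEM_A") is resolve_abstract_content("ITEM_B"))
```

The item only grows by the length of a ZID, and an edit to the shared object is seen by
every item that points to it.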
This option could be a challenge for the Wikifunctions community. The scope would expand
to cover content as well as functions. This could make things difficult for the smaller
community, which would need more community patrollers like those already active on Wikidata.
Option 4: Objects on a new Wikipedia language edition
We could launch a new Wikipedia language edition in which the main namespace is abstract
content. This could be called e.g. the multi-lingual Wikipedia (mul.wikipedia.org)
language edition, or the abstract Wikipedia language edition (abstract.wikipedia.org).
Like all other language editions, the pages are connected to the items via sitelinks in
Wikidata.
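
A small Python sketch of how such a sitelink could be followed. The site key
"abstractwiki" and the page title used on the abstract edition are assumptions made for
illustration; only the host name abstract.wikipedia.org is taken from the text above.

```python
from urllib.parse import quote

# Site key -> host. "abstractwiki" is a hypothetical key for the new edition.
SITE_HOSTS = {
    "simplewiki": "simple.wikipedia.org",
    "abstractwiki": "abstract.wikipedia.org",
}

# Item -> sitelinks (site key -> page title). The abstract title is an assumption.
SITELINKS = {
    "Q319": {"simplewiki": "Jupiter", "abstractwiki": "Jupiter"},
}


def page_url(qid, site):
    """Return the URL of the page linked from an item on a given site, if any."""
    title = SITELINKS.get(qid, {}).get(site)
    if title is None:
        return None
    return f"https://{SITE_HOSTS[site]}/wiki/{quote(title.replace(' ', '_'))}"


print(page_url("Q319", "abstractwiki"))
```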
If Wikivoyage wanted to use the same approach, they would need to copy the setup and
create a multi-lingual Wikivoyage edition (or, as Option 4B, perhaps these could be a
different namespace on a single shared ‘abstract’ content wiki?).
This would give a very clear distinction of what is Wikipedia content and what is not, and
give the abstract content a distinct visibility, which would otherwise be somewhat hidden
between Wikifunctions and Wikidata. On the other hand, it would splinter the communities
further, and mix in a "new" concept of a wiki that isn't really a Wikipedia
but is labelled as such.
Option 5: Unattached namespace on existing project
We could introduce a new namespace to an existing project where the abstract content for
Wikipedia would effectively live. Here are the most likely options:
- Wikidata (as a separate namespace, not attached to the items)
- Commons
- Meta
- English Wikipedia (not attached to the articles)
Whereas technically all these options would be the same, they would be extremely different
from a social and community perspective. We will refrain here from discussing these
differences for now, unless this starts becoming a more likely option.
Rant: One Wikimedia movement, or many projects?
Conway’s law states that software mirrors organization, or, as it was put, if you put
three teams to work on a compiler, you get a three-pass compiler. The opposite is also
true, as previously observed by Guillaume Paumier: a software system determines the
community structure that evolves around the system. Within the Wikimedia projects, we
often see this effect in the dynamics between the different Wikipedia language
communities, the Wikidata community, the Commons community, Meta, etc. The stories of
local protection of media files on Commons, of wikis opting out of global anti-abuse
tools, or of short descriptions in English Wikipedia should be warning enough.
The question we ask today is not only hard because there are genuine technical challenges
that we have to overcome, and we have to make a tradeoff. It is additionally so much
harder because we can anticipate that there will be fracture lines between the different
projects, and maybe even anticipate how these fracture lines would shape out. The whole
story would be so much easier if we, in general, regarded ourselves as being part of
one common movement, as one large community. But I doubt that we will see much progress on
this trajectory before we have to resolve the above question.
Until then we can only remain mindful of the possible solutions and their effects, and be
careful to design for the world as it is and as it likely will be, and not for the world
we wish it to be.
Public NLG workstream on Tuesday
On Tuesday, March 21, there will be the third public NLG workstream meeting on Jitsi. Feel
free to reach out to me and suggest presentations beforehand if you want. We have a bit
already planned, but there is still space. The meeting is 16:30-17:30 UTC, which is an
hour off for US friends.
_______________________________________________
Abstract-Wikipedia mailing list -- abstract-wikipedia(a)lists.wikimedia.org
List information:
https://lists.wikimedia.org/postorius/lists/abstract-wikipedia.lists.wikime…