The on-wiki version of this newsletter is here:
https://meta.wikimedia.org/wiki/Abstract_Wikipedia/Updates/2021-03-03
Hello all,
The Wikidata team at Wikimedia Deutschland will be working on improvements
to the lexicographic data part of Wikidata during this year. The Abstract
Wikipedia team at the Wikimedia Foundation will be working on the
generation of natural language text for baseline Wikipedia articles in the
next few years, and on functions in Wikifunctions to work with
lexicographic data. For both efforts, it would be beneficial to focus on a
small specific set of languages at first. Participating communities will
hopefully find that this project leads to long-term growth in Wikipedia and
Wiktionary in and about their language.
Lydia and Denny would like to choose the same focus languages for both
teams, as aligning them benefits both projects.
We will be working closely together with the focus communities over the
next few years. This means that features will land first in these languages
and we will have particularly active feedback channels. We are looking for
communities that are open to trying out new things.
The decision about which languages become the focus languages should be
made together with the wider communities. In particular, we would like to
make the decision with a promising self-selecting community. This worked
very well for Wikidata, where the focus projects were self-selected.
We will use English as a demonstration language and two or three other
languages as focus languages. English is chosen as it is easy to
demonstrate to a wide audience and is a working language for both
development teams.
For the focus languages, we want to work with an active and enthusiastic
community or seed of a community over the next few years on these projects.
In order to be fully transparent, we have compiled a number of other, more
detailed criteria
<https://www.wikidata.org/wiki/Wikidata:Lexicographical_data/Focus_languages/Requirements>
we would like to use to guide us in our decision, but this assumes that
there are communities to choose from. None of these criteria are set in
stone, and we are happy to discuss them, remove some if they are not good
ideas, or add others if we missed something. Regard this as a strawdog
proposal. For example, Mahir Morshed
<http://meta.wikimedia.org/wiki/User:Mahir256> came up with a complementary
set of criteria on Phabricator
<https://phabricator.wikimedia.org/T274373#6821602>, which we will consider
in the selection as well. We will have Q&A office hours for discussion, and
are open to comments via wiki
<https://www.wikidata.org/wiki/Wikidata_talk:Lexicographical_data/Focus_languages>
or email.
We are thinking of a two-pronged approach:
- first, to call for communities to propose themselves to work with us;
- second, to look at the data and see which languages would be good
  candidates.
We don’t want to set too strict a process. We would like the second prong
of the approach to go on throughout the whole process to help us come to a
good understanding of the options.
For the first prong, we would like the candidate seed groups to describe
and nominate themselves on wiki, following a short form
<https://www.wikidata.org/wiki/Wikidata:Lexicographical_data/Focus_languages/Form>.
Nominations should be submitted by April 7, and the teams will make the
decision by April 14, taking your comments into account. If we notice
that self-nominations are not happening, we will try to engage with
language communities directly.
It is possible that the two teams will choose different candidates,
although we will try to avoid that.
We are looking forward to hearing what you think of this proposal.
Please comment on the talk page on wiki
<https://www.wikidata.org/wiki/Wikidata_talk:Lexicographical_data/Focus_languages>.
Lydia and Denny