Hoi,
It is clear that you do not know the language policy. It does not say what
project should be the first. It does state what is needed to start with a
new language. The requirement for a first project is that there is
localisation for the most used messages of MediaWiki and a sustained effort
at the Incubator. For Wikidata there is an option to add strings in the
language when the language is supported by an ISO-639-3 language code.
Having said that, I do not mind lofty words in a strategy document but when
things are to be realistic, it is stupid to add more ballast to a fully
laden ship. We have plenty of languages in Africa and Asia with millions of
people that are functionally dormant. When you consider historic
information about Africa and Asia, there is so much information not
accessible or even available (often in other languages). Making data
available to Wikidata is one thing, I am adding information on the Ottoman
Empire from the Catalan Wikipedia that is often more complete, but gaining
is a completely different issue. There is too much hostility by Wikipedians
to Wikidata.
When you consider experiments like the Cebuano Wikipedia, it has been said
by a WMF official that this will not be studied for its effects. When you
consider that some 4 to 6% of links in lists like the "George Polk Award
Winners" are wrong, you will appreciate that this is where we could
consider alternatives that fix these issues. The problem is that there is
little interest in such issues. Alternatives to the current linking system
are possible and linking to Wikidata has proven itself from the start for
interwikilinks (it is a known good).
When we are to support 2000 languages we need to be smart about it. We
could but we need to be practical. Agreeing on the quality of information
first by comparison is one strategy where we need the help of DBpedia
because they at this time have a better framework for comparison. We could
use lexical data to generate dynamic texts in languages but imho we start
with an artificial gap between lexical data and topical data. I am
interested to learn how issues like these will be "solved".
As to your proposal for new articles; it assumes that articles need to be
saved. I am more in favour of articles that will be read and change when
the underlying data changes. Authors can step in when they feel they can do
better.
Thanks,
GerardM
On 26 March 2018 at 20:25, Leila Zia <leila(a)wikimedia.org> wrote:
Hi Gerard,
On Fri, Mar 23, 2018 at 12:13 AM, Gerard Meijssen
<gerard.meijssen(a)gmail.com> wrote:
Hoi,
I have read your comments on the Wiki Indaba. Sad to hear that you could
not make it.
As a movement it is not our task to serve the "2000" languages that you
mention. It is our task to serve the languages that we support in our
existing Wikipedias.
This is not obvious to me if I read the strategic direction [1].
Specifically under Knowledge Equity we say:
"We will welcome people from every background to build strong and
diverse communities. We will break down the social, political, and
technical barriers preventing people from accessing and contributing
to free knowledge."
Depending on how we want to operationalize "welcome" in the above
sentence, we may not want to focus on Wikipedia as the only project
which will be the path of entry for language communities. Even if it's
clear that we have to focus on Wikipedia, it is not clear to me that
we should focus our support only on the languages that already have a
Wikipedia. What if there are languages in which Wikipedia can be
present and due to the limitations of the specific community around
that language they have not been able to pull off their language
Wikipedia? Of course, I understand the tension. There is an argument to
be made that when it comes to Wikipedia, our best bet is to focus on
the languages that are already in. That's why I called out that we
will be challenged with the trade-offs.
Where you talk about subjects that people are likely to read, there are
many predictive models possible. The big issue in current approaches is
that they start with what we know from projects particularly the English
Wikipedia. The English Wikipedia is biased and consequently many subjects
that may be of a higher relevance in other languages or cultures will not
be suggested when English Wikipedia and its traffic are the yardstick to
measure by.
The ranking model in section 2.2. of
https://arxiv.org/pdf/1604.03235.pdf addresses this issue to a good
extent. There is no emphasis on one Wikipedia in that model. Please
check the list of features.
We still can do better and improve that model to not be based on the
pageviews in the destination language. As I mentioned in the report,
we've had some conversations about picking up that direction, but the
reality is that we have a working model that can predict pageviews in
the destination language based on more universal features than just
what is happening in English Wikipedia. We should use that model when
relevant! :)
Anyway, thank you for reporting on your virtual presence; you made a
difference in this way.
anytime! :)
Best,
Leila
[1]
https://meta.wikimedia.org/wiki/Strategy/Wikimedia_movement/2017/Direction#Our_strategic_direction:_Service_and_Equity
Thanks,
GerardM
On 23 March 2018 at 00:41, Leila Zia <leila(a)wikimedia.org> wrote:
> Hi all,
>
> Here is the report of the one session I attended in Wiki Indaba over the
> past weekend:
> https://meta.wikimedia.org/wiki/User:LZia_(WMF)/Trip_reports#Wiki_Indaba_2018
>
> Best,
> Leila
_______________________________________________
Wiki-research-l mailing list
Wiki-research-l(a)lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wiki-research-l