Hoi, It is clear that you do not know the language policy. It does not say which project should be the first. It does state what is needed to start with a new language. The requirement for a first project is that there is localisation for the most used messages of MediaWiki and a sustained effort at the Incubator. For Wikidata there is an option to add strings in a language when that language has an ISO 639-3 language code.
Having said that, I do not mind lofty words in a strategy document, but when things are to be realistic, it is stupid to add more ballast to a fully laden ship. We have plenty of languages in Africa and Asia with millions of people that are functionally dormant. When you consider historic information about Africa and Asia, there is so much information that is not accessible or even available (often in other languages). Making data available to Wikidata is one thing (I am adding information on the Ottoman Empire from the Catalan Wikipedia, which is often more complete), but gaining is a completely different issue. There is too much hostility by Wikipedians to Wikidata.
When you consider experiments like the Cebuano Wikipedia, it has been said by a WMF official that this will not be studied for its effects. When you consider that some 4 to 6% of links in lists like the "George Polk Award Winners" are wrong, you will appreciate that this is where we could consider alternatives that fix these issues. The problem is that there is little interest in such issues. Alternatives to the current linking system are possible, and linking to Wikidata has proven itself from the start for interwiki links (it is a known good).
When we are to support 2000 languages we need to be smart about it. We could but we need to be practical. Agreeing on the quality of information first by comparison is one strategy where we need the help of DBpedia because they at this time have a better framework for comparison. We could use lexical data to generate dynamic texts in languages but imho we start with an artificial gap between lexical data and topical data. I am interested to learn how issues like these will be "solved".
As to your proposal for new articles, it assumes that articles need to be saved. I am more in favour of articles that will be read and that change when the underlying data changes. Authors can step in when they feel they can do better. Thanks, GerardM
On 26 March 2018 at 20:25, Leila Zia leila@wikimedia.org wrote:
Hi Gerard,
On Fri, Mar 23, 2018 at 12:13 AM, Gerard Meijssen gerard.meijssen@gmail.com wrote:
Hoi, I have read your comments on the Wiki Indaba. Sad to hear that you could not make it.
As a movement it is not our task to serve the "2000" languages that you mention. It is our task to serve the languages that we support in our existing Wikipedias.
This is not obvious to me if I read the strategic direction [1]. Specifically under Knowledge Equity we say:
"We will welcome people from every background to build strong and diverse communities. We will break down the social, political, and technical barriers preventing people from accessing and contributing to free knowledge."
Depending on how we want to operationalize "welcome" in the above sentence, we may not want to focus on Wikipedia as the only project which will be the path of entry for language communities. Even if it's clear that we have to focus on Wikipedia, it is not clear to me that we should focus our support only on the languages that already have a Wikipedia. What if there are languages in which Wikipedia can be present, but due to the limitations of the specific community around that language they have not been able to pull off their language Wikipedia? Of course, I understand the tension. There is an argument to be made that when it comes to Wikipedia, our best bet is to focus on the languages that are already in. That's why I called out that we will be challenged with the trade-offs.
Where you talk about subjects that people are likely to read, there are many predictive models possible. The big issue in current approaches is that they start with what we know from projects, particularly the English Wikipedia. The English Wikipedia is biased, and consequently many subjects that may be of a higher relevance in other languages or cultures will not be suggested when English Wikipedia and its traffic are the yardstick to measure by.
The ranking model in section 2.2 of https://arxiv.org/pdf/1604.03235.pdf addresses this issue to a good extent. There is no emphasis on one Wikipedia in that model. Please check the list of features. We can still do better and improve that model so that it is not based on the pageviews in the destination language. As I mentioned in the report, we've had some conversations about picking up that direction. But the reality is that we have a working model that can predict pageviews in the destination language based on more universal features than just what is happening in English Wikipedia. We should use that model when relevant! :)
Anyway, thank you for reporting on your virtual presence; you made a difference in this way.
anytime! :)
Best, Leila
[1] https://meta.wikimedia.org/wiki/Strategy/Wikimedia_movement/2017/Direction#Our_strategic_direction:_Service_and_Equity
Thanks, GerardM
On 23 March 2018 at 00:41, Leila Zia leila@wikimedia.org wrote:
Hi all,
Here is the report of the one session I attended in Wiki Indaba over the past weekend: https://meta.wikimedia.org/wiki/User:LZia_(WMF)/Trip_reports#Wiki_Indaba_2018
Best, Leila _______________________________________________ Wiki-research-l mailing list Wiki-research-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wiki-research-l