Langcom November 2013

langcom@lists.wikimedia.org

8 participants
10 discussions

Request for joining the Language Committee
by Chi Hong Lee 20 Apr '15

20 Apr '15

Dear all, I am Gabriel Lee from Hong Kong, and I would like to become a member of the Langcom. I know zh-n, zh-yue-5, en-5, ja-2 and fr-2. Please contact me by sending an email to chihonglee777(a)gmail.com. Thanks a lot! Regards, Gabriel Chi Hong Lee

2 1

Konkani
by Gerard Meijssen 09 Jan '14

09 Jan '14

Hoi, I just read this [1]. They will probably want to include this in their Wikipedia, and I am sure that it is better to do this in a Wikipedia than to add it in the Incubator. My proposal is to allow them a Wikipedia if we can get some assurances that they will do the work needed for localisation as soon as possible. Hope you concur. Thanks, Gerard [1] http://blog.wikimedia.org/2013/11/26/konkani-vishkawosh-free-license/

3 5

Changing the requirement for a Wikipedia
by Gerard Meijssen 23 Nov '13

23 Nov '13

Hoi, You may have missed it in the mail I sent yesterday, but I want to propose changing the requirements for a new Wikipedia. My proposal is to ask either for the current number of full articles, or for 50 full articles plus the complete labelling of a selected set of 250 Wikidata items. The rationale is that it takes much less effort to work on Wikidata, and the work quickly provides additional results for other items. An example: the property "sex" should be in use on all humans, and the qualifiers "male" and "female" should be in use on all of these as well. Given that we can provide results from Wikidata in a search on Wikipedia, and given that we can already provide visualisation for humans and organisms, the arguments in favour of this move at this time are quite strong. Add to this that through Wikidata we can connect to Commons, and we have all the ingredients for this to quickly have a big impact on the usability of our projects for a new language. If it turns out that we are in favour of this move, I expect it will require confirmation from the board. It will also mean that WMF has to consider some changes to the official search routines. (I expect this will NOT be a problem.) Thanks, GerardM
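The proposed rule above can be read as a simple either/or check. The sketch below is only an illustration of that reading: the function name and the `current_article_threshold` parameter are hypothetical, and the message does not state what the current article requirement is, so it is passed in rather than assumed.

```python
def meets_proposed_criteria(full_articles: int,
                            fully_labelled_items: int,
                            current_article_threshold: int) -> bool:
    """Return True if a test wiki would satisfy the proposed requirement.

    current_article_threshold is a placeholder for "the current number of
    full articles"; the original message does not give a figure.
    """
    # Route 1: the existing requirement, unchanged.
    existing_route = full_articles >= current_article_threshold
    # Route 2: 50 full articles plus 250 completely labelled Wikidata items.
    wikidata_route = full_articles >= 50 and fully_labelled_items >= 250
    return existing_route or wikidata_route
```

Either route is sufficient on its own, which matches the "or" in the proposal.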

3 3

Approval of Votic Wikipedia
by MF-Warburg 23 Nov '13

23 Nov '13

Hi all, I propose to approve Votic Wikipedia (vot). Meta request: https://meta.wikimedia.org/wiki/Requests_for_new_languages/Wikipedia_Votic Test wiki: https://incubator.wikimedia.org/wiki/Wp/vot Activity: http://toolserver.org/~pathoschild/catanalysis/?cat=0&title=Wp/vot&wiki=inc… (at least 3 users with >10 edits for the past 5 months, and also previous such activity in 2012). Translation of the most-used messages is complete (< http://toolserver.org/~robin/?tool=codelookup&code=vot>). As this would be the first project in Votic, we would also need an expert to verify the content. (By the way, verification for Livonian is still pending; my attempts to contact people were unsuccessful.) Best regards, MF-Warburg

6 14

Fwd: Support for small languages
by Gerard Meijssen 18 Nov '13

18 Nov '13

Hoi, I have been asked by Erik what can be done to better support small languages, and in particular what we can do to support more small languages more effectively. I have thought about it for a long time and, as far as I am concerned, try as we might it will not happen as long as there is no clear benefit we bring. In this text I describe how we can provide more value to people of *any* language. For new and small languages the emphasis will be on bold and easy strokes that have a big impact. Key in all this is that we have to connect to what is already there. This makes search key. Two of the most important objectives are finding pictures and finding information. Wikidata provides the most obvious tool for this because it takes little effort to connect to the information that is already there. Half an hour a day spent labelling items that are in the news will rapidly swell the number of frequently searched terms available in any language. When people search for something, they either find it or they do not. When a search is entered and nothing is found, the item may exist either under a different spelling or in a different language. When something is NOT found, we should ask whether the person knows a synonym in their language or a translation in another language. With the new term we iterate the search. When something is found after one iteration, we ask whether this item is indeed what was meant. One image and a first paragraph of text should suffice for that. When it is, we add the original search term as a (dirty) label. Adding labels in this way will quickly swell the number of terms available for search in a language. Most importantly, we turn a failure into a success, one that benefits everyone who seeks the same information. When an item is found in a language, we can provide information in that language in the format of an infobox or a Reasonator page. Obviously many statements may not exist in that language.
These can be shown blinking, or presented in another language, or marked in some other way, so that they can be added in the primary language. This approach will ensure that a teacher can select the search terms he is interested in and prepare the information for his students. Another approach is to learn where we fail to provide information. We do not know which search terms fail most often. Consequently we do not have the tools to remedy this in any language. The basis of data-driven user participation is that we KNOW what to ask for and why. When people start to find pictures because of the link Wikidata has with Commons, we need to understand it and see it coming before kids in schools all over the world really start hammering our servers. The objective is to reach the tipping point where we become useful in a language. I have been asked to become an advisory board member for the PanLex Project <http://panlex.org> of The Long Now Foundation. I have accepted this, and what they are interested in is experimenting with one language to see how their content can make a difference in Wikidata but equally how Wikidata can make a difference in Wikidata. My take on their objective is that their work makes no difference if it is not used. An experiment will see their staff work on leveraging our data and software and vice versa. In my opinion this will make information useful as explained above. We have the opportunity to experiment with the Long Now Foundation and at the same time develop tooling that will help all our languages and will help us reach the tipping point where Wikidata is useful for all of them. I also propose to change the criteria for accepting new WMF projects. So far, for a Wikipedia we have asked for many written articles of high quality. In practice we accepted many articles of stub quality. What I propose is to have something like 50 articles of a substantial size and to complement this with 250 items that have labels for all the statements.
These 250 items cover many domains but are optimised for being what people are likely to search for. I have been pushing and experimenting along these lines. The results are a search tool using Wikidata in Wikipedia, a demonstration that Wikidata knows more items than Wikipedia has articles, visualisation for people and organisms in the "Reasonator", and a growing personal conviction that this is how we can grow any language to the fullest of its potential. My question is: what do you think? How can we implement this? What more can we do? Thanks, Gerard PS I fear that when children find that they can find pictures in THEIR language, they will be able to bring our servers down. A luxury problem I am sure :) PS-2 A big thank you to Magnus Manske and Lydia Pintscher for the wonderful work they do.
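The search-and-label loop described in this message (search, ask for a synonym or translation on failure, retry once, confirm, then store the original term as a "dirty" label) can be sketched roughly as follows. This is a minimal illustration, not an existing tool: the in-memory `labels` dictionary stands in for Wikidata, and `find_item`, `search`, `ask_alternative`, `confirm` and `failed_terms` are all hypothetical names. The item id and the words in the usage example are made up for illustration.

```python
from collections import Counter
from typing import Callable, Dict, Optional, Set

# Hypothetical stand-in for Wikidata: item id -> set of known labels.
Labels = Dict[str, Set[str]]

# Terms that still failed after one retry, so we learn what to ask for.
failed_terms: Counter = Counter()


def find_item(term: str, labels: Labels) -> Optional[str]:
    """Return the id of the first item that has `term` as a label."""
    needle = term.casefold()
    for item_id, names in labels.items():
        if any(needle == name.casefold() for name in names):
            return item_id
    return None


def search(term: str,
           labels: Labels,
           ask_alternative: Callable[[str], Optional[str]],
           confirm: Callable[[str], bool]) -> Optional[str]:
    """Search once; on failure ask for a synonym or translation and retry."""
    item = find_item(term, labels)
    if item is not None:
        return item
    alternative = ask_alternative(term)  # synonym or translation, if any
    if alternative:
        item = find_item(alternative, labels)
        # Ask the user whether the item found is what was intended.
        if item is not None and confirm(item):
            labels[item].add(term)  # store the original term as a "dirty" label
            return item
    failed_terms[term] += 1  # record the failure for later follow-up
    return None


# Usage with made-up data: the second search teaches the index a new label.
labels = {"Q1": {"cat", "house cat"}}
search("poes", labels, ask_alternative=lambda t: "cat", confirm=lambda i: True)
```

After the call, "poes" is a label of the item, so the next person searching for it succeeds without the extra round trip, and `failed_terms` gives a simple view of which terms still fail.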

1 0

Ottoman Turkish
by Gerard Meijssen 13 Nov '13

13 Nov '13

Hoi, I want to ask you for eligibility for Ottoman Turkish (ota) for use in Wikidata. Thanks, Gerard

5 9

Support for languages in Wikidata
by Gerard Meijssen 12 Nov '13

12 Nov '13

Hoi Jan-Bart, When a new language is deemed "eligible" we, the language committee, inform the chair of the board, because he or she has a week to indicate that the board does not agree with our recommendation. When the chair of the board does not reply, a bug in Bugzilla will ask for the new language. At this time I ask formal permission for Ottoman Turkish to be used *exclusively* for Wikidata and its applications. Ottoman Turkish is a historic language, but with information in this language we will be able to include the original name of something in an infobox in any Wikipedia. Jan-Bart, we are likely to make requests like this for many living languages in the near future as well, because it is expected that in six months' time development will start on integrating Wikidata with Commons. This will make it possible to search optimally for pictures in the native language of the children of this world. To a large extent this is already possible for the "big" languages like Dutch. (It only requires a relatively easy hack for this to happen... Interested?) Thanks, Gerard

2 1

Language eligibility and Wikidata
by Gerard Meijssen 04 Nov '13

04 Nov '13

Hoi, In the language committee we have agreed that any ISO 639-3 language may be eligible for adding labels in Wikidata. This will open up Wikidata as a source in its own right, and it will allow for searching images and Wikipedia articles in other languages. There are technical considerations that need to be sorted out, particularly the method whereby languages are enabled within the Wikimedia Foundation. I have created a concept [1] for a blog post announcing this decision. It has already been read by Amir, so it can be safely read on Meta. One of the reasons for this is an old request by Erik to look into making it easier for languages to find their way into WMF. It is likely that one minority language will get its data into Wikidata when this policy is enacted. Thanks, Gerard [1] https://meta.wikimedia.org/wiki/Wikimedia_Blog/Drafts/Any_anguage_allowed_in

3 4

Northern Tujia
by MF-Warburg 03 Nov '13

03 Nov '13

This seems eligible to me, < https://meta.wikimedia.org/wiki/Requests_for_new_languages/Wikipedia_Northe… >. Note that there are two dialects, both with an ISO code, but this one (Northern, tji) is spoken by the vast majority of speakers. So it also seems reasonable that they start by requesting this one.

3 2

Wiktionary Klingon
by MF-Warburg 03 Nov '13

03 Nov '13

I propose to reject this. https://meta.wikimedia.org/wiki/Requests_for_new_languages/Wiktionary_Kling…

6 6
