Hoi,

My point is very much that we can do things NOW. There is no need to point to things that you would like when there is so much that CAN be done.

The WMF IS working on translation functionality. That may help under some circumstances but it is unlikely to help the languages that do not have a Wikipedia yet. It is not likely to help the majority of those 285 languages either.

Functionality that is probably available for many languages is transliteration. This is where, for instance, a name is made available in another script. When such routines are available, we can transliterate the names of the humans in Wikidata, and other types of data where we can safely assume that the transliteration is valid as a label for a language.

What we need to decide upon is what we can do now. India has so many people that there must be at least one person who can hear this message and understand that we can build relevant information in most if not all languages of India. When we advertise the existence of relevant information in a language, more people will have a look and some will take an interest. This is how you get to the tipping point where some people start using their language in a digital way. Using lexical data to add to Wikidata is one way, one step. It starts with the realisation that we can make a difference now.

We can do this, we can do this now.
Thanks,
      Gerard


On 6 January 2014 02:14, sankarshan <foss.mailinglists@gmail.com> wrote:
On Sun, Jan 5, 2014 at 11:45 PM, Gerard Meijssen
<gerard.meijssen@gmail.com> wrote:
> I totally agree that there are translation dictionaries for many languages.
> However, putting such content to work IS a big issue. Typically such
> dictionaries are only available as a dead wood publication. People either
> have one or don't and the only thing they are good for is finding the
> corresponding word in the other language anyway.

That is a fair and an extremely pertinent point. However, dictionaries
(if we are using the same meaning for the word!) are by themselves
merely lists of words. A lack of a freely available dictionary, or a
list of translated terms, is not a complete blocker to content. A lack
of a strong spell checker might be. But, I wouldn't put such emphasis
on an organized list like a dictionary.

> My point is that such content can be put to work.

And, my take is that what is required is more focused and stronger
investment over repeated cycles into automated translation systems,
especially perhaps Statistical-MT.

> Yes, I totally agree that there are issues to use many languages on the
> Internet. However, the WMF has in its tooling the ability to bring you
> webfonts and input methods for many/most languages. When we get to work with
> publishers / enthusiasts for specific languages we CAN add these to the
> existing languages. As the WMF toolkit can be used on Chrome and Firefox
> browsers, it means that this toolkit is very much available for more
> languages.

Webfonts and input methods are aids - they seed the first round of
content generation and continue to impact generation of content over
a period. For example, ULS has made it possible for content creators
of websites to stop worrying about having to limit themselves to a
couple of languages. But, what's beyond ULS? Surely there is something
that needs to be improved. Where's the plan for the "WMF Toolkit"?

An analogy would be chalk and blackboard. However, writing materials
are not limited *only to* chalk and blackboard, or quills and
parchment. There is a need to push sustained and well thought out
efforts into the creation of content that is relevant, available and
especially well curated. Unfortunately, WMF has not had much
discussion in public about content translation pieces except for a
thread of ideas initiated by Erik. And I have not read any plan of
action that talks about how to even think about doing this and making
it available. Then again, the content of the world is not limited to
Wikipedia. Content is being created/written everywhere (even this
discussion is content perhaps worth having in multiple languages).

> So yes, there are more problems but for many if not most languages content
> is king if we want to bring more languages to the Internet. And, yes we can
> when we put our lexical content to it.

I am not disagreeing with your position. I am merely pointing out that
the basic language plumbing that already exists needs to be augmented,
and that there needs to be some thought about what the next step of
efforts would be.


--
sankarshan mukhopadhyay
<https://twitter.com/#!/sankarshan>

_______________________________________________
Wikimediaindia-l mailing list
Wikimediaindia-l@lists.wikimedia.org
To unsubscribe from the list / change mailing preferences visit https://lists.wikimedia.org/mailman/listinfo/wikimediaindia-l