How can we best anticipate and plan for ever more accurate translation between Wikipedia / Wikidata languages, with full STEM precision? What is the road map? How might the "Unit Localization" Phabricator RFC (https://phabricator.wikimedia.org/T86528) fit into a series of Phabricator RFCs in a longer-term plan for high-quality Wikidata translation? Can we begin to lay out this road map now for all of Wikipedia's 358 languages (and even anticipate all 7,943 language entries in Glottolog)?

Would it be possible to dovetail this with Phabricator RFCs for developing Wiktionary together with Content Translation?

(WUaS, which donated CC WUaS to CC Wikidata last autumn, would like to help develop such translation, for example for CC MIT OCW in 7 languages and CC Yale OYC, in addition to MediaWiki Content Translation.)

Scott


On Jul 28, 2016 3:27 AM, "Lydia Pintscher" <lydia.pintscher@wikimedia.de> wrote:
On Wed, Jul 27, 2016 at 9:18 PM, Stas Malyshev <smalyshev@wikimedia.org> wrote:
> Hi!
>
> Right now, quantities with units are displayed by attaching the unit
> name to the number. While it gives the idea of what is going on, it is
> somewhat ungrammatical in English (83 kilogram, 185 centimetre, etc.)
> [1] and in other languages: in Russian, for example, it's
> 83 килограмм, 185 сантиметр instead of the correct
> "83 килограмма", "185 сантиметров". For some
> units, the norms are kind of tricky and fluid (e.g. see [2]), and they
> are not even identical across all units in the same language, but the
> common theme is that there are grammatical rules on how to do it and
> we're ignoring them right now.
>
> I think we do have some means to grammatically display numbers - for
> example, number of references is displayed correctly in English and
> Russian. As I understand it, this is done by using certain formats in
> message strings, and these formats are supported in the code in the
> Language classes. So, I wonder if we should maybe have an (optional)
> property that defines the same format for units? We could then reuse
> the same code to display units in a grammatically correct way.
>
> Alternatively, we could use the short unit display [3] - i.e. cm
> instead of centimetre - and then plurals are not required. However,
> this relies on units having short names, and for some units short
> names can be rather obscure, and maybe in some languages short names
> need grammatical forms too. Given that we do not link unit names, that
> would be rather confusing (btw, why don't we?). Some units may not
> have short forms at all.
>
> And the short names do not exactly match the languages - rather, they
> usually match the script (i.e. Cyrillic, or Latin, or Hebrew) - and we
> may not even have data on which language uses which script, in a useful
> form. So using short forms is very tricky.
>
> Any other ideas on this topic? Do we have a ticket tracking this
> somewhere? I looked but couldn't find it.
>
> [1]
> http://english.stackexchange.com/questions/22082/are-units-in-english-singular-or-plural
> [2]
> https://ru.wikipedia.org/wiki/%D0%9E%D0%B1%D1%81%D1%83%D0%B6%D0%B4%D0%B5%D0%BD%D0%B8%D0%B5_%D0%92%D0%B8%D0%BA%D0%B8%D0%BF%D0%B5%D0%B4%D0%B8%D0%B8:%D0%9E%D1%84%D0%BE%D1%80%D0%BC%D0%BB%D0%B5%D0%BD%D0%B8%D0%B5_%D1%81%D1%82%D0%B0%D1%82%D0%B5%D0%B9#.D0.A1.D0.BA.D0.BB.D0.BE.D0.BD.D0.B5.D0.BD.D0.B8.D0.B5_.D0.B5.D0.B4.D0.B8.D0.BD.D0.B8.D1.86_.D0.B8.D0.B7.D0.BC.D0.B5.D1.80.D0.B5.D0.BD.D0.B8.D1.8F
> [3] https://phabricator.wikimedia.org/T86528

The discussion about how to do this is happening in
https://phabricator.wikimedia.org/T86528. The basic problem is that we
do use items for the units. I think this is the right thing to do, but
it does make this particular part a bit tricky.
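
To make the idea above concrete, here is a minimal sketch of choosing a
grammatical unit form from the number and the language. It is not the
Wikibase or MediaWiki implementation. The item IDs (Q_KILOGRAM,
Q_CENTIMETRE), the UNIT_FORMS table and the function names are
hypothetical and stand in for forms that would have to be stored with
the unit items. The plural categories follow the usual integer rules
for English (one/other) and Russian (one/few/many); decimal amounts
are not handled.

```python
# Sketch only: hypothetical data and names, not the Wikibase API.

def plural_index_en(n: int) -> int:
    """English: index 0 for exactly 1, index 1 for everything else."""
    return 0 if n == 1 else 1


def plural_index_ru(n: int) -> int:
    """Russian integers: 0 = one (1, 21, ...), 1 = few (2-4, 22-24, ...),
    2 = many (0, 5-20, 25-30, ...)."""
    if n % 10 == 1 and n % 100 != 11:
        return 0
    if 2 <= n % 10 <= 4 and not 12 <= n % 100 <= 14:
        return 1
    return 2


PLURAL_RULES = {"en": plural_index_en, "ru": plural_index_ru}

# Hypothetical per-item, per-language forms, ordered by plural index.
UNIT_FORMS = {
    "Q_KILOGRAM": {
        "en": ["kilogram", "kilograms"],
        "ru": ["килограмм", "килограмма", "килограммов"],
    },
    "Q_CENTIMETRE": {
        "en": ["centimetre", "centimetres"],
        "ru": ["сантиметр", "сантиметра", "сантиметров"],
    },
}


def format_quantity(amount: int, unit_id: str, lang: str) -> str:
    """Attach the unit form that agrees with the amount in `lang`."""
    forms = UNIT_FORMS[unit_id][lang]
    index = PLURAL_RULES[lang](amount)
    # If fewer forms are stored than the rule expects, use the last one.
    return f"{amount} {forms[min(index, len(forms) - 1)]}"


print(format_quantity(83, "Q_KILOGRAM", "ru"))
print(format_quantity(185, "Q_CENTIMETRE", "ru"))
print(format_quantity(185, "Q_CENTIMETRE", "en"))
```

Because units are items, a table like UNIT_FORMS cannot live in the
interface messages; the forms would have to come from data attached to
each unit item, which is the tricky part mentioned above.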


Cheers
Lydia

--
Lydia Pintscher - http://about.me/lydia.pintscher
Product Manager for Wikidata

Wikimedia Deutschland e.V.
Tempelhofer Ufer 23-24
10963 Berlin
www.wikimedia.de

Wikimedia Deutschland - Gesellschaft zur Förderung Freien Wissens e. V.

Registered in the register of associations of the Amtsgericht
Berlin-Charlottenburg under number 23855 Nz. Recognized as a non-profit
organization by the Finanzamt für Körperschaften I Berlin, tax number
27/029/42207.
