I tried to see how the ISO codes and IANA language subtags compare with Glottolog's 8,444 entries under languages (http://glottolog.org/glottolog/language) and Ethnologue's 7,099 living languages (https://www.ethnologue.com/), but couldn't find any comparisons or comparative lists.
Will it be possible with these new developments in Wikidata to query for these possibilities, and to leave the options open for a growing list of languages, as well as a universal translator?
And how will invented languages such as Krell, Elvish and Klingon be added (and even other species' languages in emergent interspecies communication), possibly via OpenNMT (Neural Machine Translation) - http://opennmt.net/ (and possibly GNMT)? See also Peter Norvig's recent article regarding OpenNMT and invented languages - https://medium.com/@peternorvig/last-tweets-of-the-krell-82b8cb74c320 (and http://scott-macleod.blogspot.com/2017/04/falco-peregrinus-smartphone-that-c... ).
Scott
On Fri, Apr 7, 2017 at 10:13 AM, Daniel Kinzler <daniel.kinzler@wikimedia.de> wrote:
On 07.04.2017 at 01:34, Denny Vrandečić wrote:
I foresee that might be a bit of a problem for external tools consuming this data - how would they figure out what language it is if it doesn't have a code? We could of course generate fake codes like mis-x-q12345, maybe that would work.
Q-items for languages already have a property to state their language code. It's just an extra hop away.
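As a rough illustration of that extra hop, the sketch below builds a SPARQL query that reads the language tag stored on a language's Q-item. It assumes that property P305 (IETF language tag) is the one a consumer would read, and the helper name build_language_tag_query is made up for this example. The query is only constructed here, not sent to any endpoint.

```python
import re

# Wikidata item ids look like "Q" followed by a positive integer.
QID = re.compile(r"^Q[1-9][0-9]*$")


def build_language_tag_query(qid, prop="P305"):
    """Return a SPARQL query string asking for the language tag of `qid`.

    `prop` defaults to P305 (IETF language tag); this default is an
    assumption of the sketch, and other code properties could be passed.
    """
    if not QID.match(qid):
        raise ValueError(f"not a Q-id: {qid!r}")
    return (
        "SELECT ?tag WHERE { "
        f"wd:{qid} wdt:{prop} ?tag . "
        "}"
    )


# Q188 is the Wikidata item for the German language.
print(build_language_tag_query("Q188"))
```

A consumer would run the resulting query against a SPARQL endpoint and fall back to some placeholder when no tag comes back.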
We want ISO codes (or rather, IANA language subtags [1]), so we can use them in HTML lang attributes, and in RDF literals. This allows interoperability with standard tools.
For this reason, I also favor a mixed approach, that allows standard language tags to be used whenever possible. I have some ideas on how that could work, but no definite plan yet.
Something like de+Q1980305 could work; when generating HTML or RDF, we'd just drop the suffix. For translingual entries (e.g. for the number symbol i), we could use e.g. mis+Q1140046.
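A minimal sketch of the suffix-dropping idea, assuming the mixed form is simply a standard tag, a "+", and a Q-id. The pattern and the function names split_mixed_tag and html_lang are hypothetical, not an agreed format or an existing API, and the tag part is not validated against the IANA registry.

```python
import re

# Hypothetical grammar for the "tag+Q-id" notation: a language tag made of
# letters, digits and hyphens, optionally followed by "+" and a Wikidata Q-id.
MIXED_TAG = re.compile(r"^(?P<tag>[A-Za-z0-9-]+)(?:\+(?P<item>Q[1-9][0-9]*))?$")


def split_mixed_tag(value):
    """Split e.g. 'de+Q1980305' into ('de', 'Q1980305').

    A plain tag such as 'en' yields ('en', None).
    """
    match = MIXED_TAG.match(value)
    if match is None:
        raise ValueError(f"unrecognised language value: {value!r}")
    return match.group("tag"), match.group("item")


def html_lang(value):
    """Return only the standard tag, for use in an HTML lang attribute
    or an RDF literal; the Q-id suffix is dropped."""
    tag, _item = split_mixed_tag(value)
    return tag


for example in ("de+Q1980305", "mis+Q1140046", "en"):
    print(example, "->", html_lang(example))
```

Standard tools would only ever see the part before the "+", while Wikidata-aware tools could use the Q-id for the finer distinction.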
[1] https://www.iana.org/assignments/language-subtag-registry/language-subtag-registry
-- Daniel Kinzler Principal Platform Engineer
Wikimedia Deutschland Gesellschaft zur Förderung Freien Wissens e.V.
Wikidata mailing list Wikidata@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikidata