Hi,
I have a vague recollection that we require Unicode support to create a new language, but I cannot find it in the policy. Do we indeed require this explicitly, or am I just making things up?
Or is it just a de facto practicality, in that it's technically difficult to host a language in a script that isn't supported in Unicode?
Hoi, In the past the WMF has funded the creation of a Unicode font. Having a Unicode font is essential if we are to support a script in MediaWiki. Not having a fully developed font is what hinders the necessary follow-up of projects in SignWriting, i.e. all the signed languages.
I do agree that a language with a default script not supported in Unicode is hugely problematic. Thanks, GerardM
Langcom mailing list Langcom@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/langcom
I'm not talking about a missing font, even though this would be a problem, too. I'm talking about scripts that aren't encoded at all, for example the Zaghawa alphabet.
And I understand, of course, that it's technically problematic. What I'm asking is whether there's an explicit written Language committee policy about it, or is it just a de facto practical matter.
Hoi, Unicode has funding drives where they ask the PUBLIC to support one character. Why not have the WMF fund the missing characters in scripts we need for our projects? Thanks, GerardM
On 14 Apr 2019, at 17:58, Gerard Meijssen gerard.meijssen@gmail.com wrote:
Unicode has funding drives where they ask the PUBLIC to support one character. Why not have the WMF fund the missing characters in scripts we need for our projects?
I’ve suggested this for years. At the first Hackathon I attended in Berlin, for example.
It’s not as though you guys don’t know some sort of expert.
Michael
For all I know, it is only a de facto matter.
On 14 Apr 2019, at 18:28, MF-Warburg mfwarburg@googlemail.com wrote:
For all I know, it is only a de facto matter.
It would not be good for a language to have an encyclopaedia in PUA characters, and their browsers would perpetually be substituting CJK characters encoded in that space too.
Michael
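To make the PUA concern concrete, here is a minimal sketch of how one might check whether a piece of text relies on Private Use Area code points. It assumes only Python's standard `unicodedata` module; the function name `private_use_chars` and the sample string are made up for illustration and are not part of any Wikimedia tooling.

```python
import unicodedata


def private_use_chars(text: str) -> list[str]:
    """Return the characters in `text` whose general category is Co (private use)."""
    return [ch for ch in text if unicodedata.category(ch) == "Co"]


# Illustrative sample only: U+E000 and U+E001 are PUA code points with no
# standard meaning, so what they display depends entirely on the font in use.
sample = "Beria \ue000\ue001 text"
print([f"U+{ord(ch):04X}" for ch in private_use_chars(sample)])
```

Because a private-use code point carries no agreed meaning, a check like this can only say that the text depends on a private convention, not which script or character was intended.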
Hoi, The WMF does have the technology to provide the font from its servers. It is not the case that characters would not show properly. Thanks, GerardM
On 15 Apr 2019, at 06:28, Gerard Meijssen gerard.meijssen@gmail.com wrote:
The WMF does have the technology to provide the font from its servers. It is not the case that characters would not show properly.
If something is worth doing, it is worth doing right. If a language is worthy of a Wikipedia, its characters are worth encoding.
I will vehemently oppose any Wikipedia which has to use the PUA to display text.
Michael Everson
Gerard Meijssen gerard.meijssen@gmail.com wrote on Mon, 15 Apr 2019 at 13:28:
Hoi, The WMF does have the technology to provide the font from its servers. It is not the case that characters would not show properly. Thanks, GerardM
How? The topic was discussed on Chinese Wikipedia for months last year (see: https://zh.wikipedia.org/wiki/Template_talk:%E7%BC%BA%E5%AD%97 ), yet there is still no implementable solution to the problem of displaying Chinese characters that are either not yet encoded in Unicode, or newly encoded but still without sufficient font support.
Michael Everson everson@evertype.com wrote on Tue, 16 Apr 2019 at 02:08:
If something is worth doing, it is worth doing right. If a language is worthy of a Wikipedia, its characters are worth encoding.
I will vehemently oppose any Wikipedia which has to use the PUA to display text.
Unicode's priorities are not the same as those of Wikipedia/WMF. For example, quite a few Chinese characters are used in various Chinese Wikipedia articles, but Unicode either has not encoded them yet or has already refused to encode them for various reasons.
On 14 Apr 2019, at 17:49, Amir E. Aharoni amir.aharoni@mail.huji.ac.il wrote:
I'm not talking about a missing font, even though this would be a problem, too. I'm talking about scripts that aren't encoded at all, for example the Zaghawa alphabet.
We call that Beria. It’s not encoded. If WM needs it encoded, WM should help make that happen.
And I understand, of course, that it's technically problematic. What I'm asking is whether there's an explicit written Language committee policy about it, or is it just a de facto practical matter.
A Beria Wiki would use PUA characters and all of that data would have to be transcoded later.
Michael
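As a rough illustration of what "transcoded later" could involve, the sketch below remaps private-use code points to standard ones with `str.translate` and refuses to continue if any private-use character is left unmapped. Everything specific here is an assumption: the table `PUA_TO_STANDARD`, the function `transcode`, and the chosen code points are placeholders, not real assignments for Beria or any other script.

```python
import unicodedata

# Hypothetical mapping from private-use code points to their eventual standard
# code points. The targets are placeholder Latin letters for illustration only.
PUA_TO_STANDARD = {0xE000: ord("A"), 0xE001: ord("B")}


def transcode(text: str, table: dict[int, int] = PUA_TO_STANDARD) -> str:
    """Remap PUA characters via `table`; raise if any PUA character remains."""
    converted = text.translate(table)
    leftover = sorted({ch for ch in converted if unicodedata.category(ch) == "Co"})
    if leftover:
        codes = ", ".join(f"U+{ord(ch):04X}" for ch in leftover)
        raise ValueError(f"unmapped private-use characters: {codes}")
    return converted


print(transcode("\ue000\ue001 x"))
```

Even in this toy form, the conversion only works if every page used the same private mapping, which is part of why such data is hard to interchange reliably.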
On 14 Apr 2019, at 17:41, Gerard Meijssen gerard.meijssen@gmail.com wrote:
In the past the WMF has funded the creation of a Unicode font.
When?
Having a Unicode font is essential if we are to support a script in MediaWiki. Not having a fully developed font is what hinders the necessary follow-up of projects in SignWriting, i.e. all the signed languages.
Actually there’s a reasonable font out there, though there are lots of difficulties. I worked on the code chart font for SignWriting.
I do agree that a language with a default script not supported in Unicode is hugely problematic.
It could only be Private Use Characters. If a script exists, it should be encoded.
Encoding isn’t free.
Michael
On 14 Apr 2019, at 17:35, Amir E. Aharoni amir.aharoni@mail.huji.ac.il wrote:
Or is it just a de facto practicality, in that it's technically difficult to host a language in a script that isn't supported in Unicode?
It is impossible to do so reliably and interchangeably.
Michael