I'm trying to render an image which uses characters from all of the languages supported by WP. Is there a single font deployed on production servers that includes all scripts? Any simple font would do, preferably TTF, Arial-style. Thanks!
No such font exists. You can try DejaVu Sans or FreeSerif for best coverage.
Nemo
Thanks Federico, I used /usr/share/fonts/truetype/ttf-dejavu/DejaVuSans.ttf but didn't see FreeSerif. DejaVuSans doesn't seem to render Hindi. Is there a font for that?
On Wed, Jul 30, 2014 at 2:12 AM, Federico Leva (Nemo) nemowiki@gmail.com wrote:
No such font exists. You can try DejaVu Sans or FreeSerif for best coverage.
Nemo
Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l
On Tue, Jul 29, 2014 at 3:46 PM, Yuri Astrakhan yastrakhan@wikimedia.org wrote:
Thanks Federico, I used /usr/share/fonts/truetype/ttf-dejavu/DejaVuSans.ttf but didn't see FreeSerif. DejaVuSans doesn't seem to render Hindi. Is there a font for that?
For Devanagari, my system is using Gargi: https://fedoraproject.org/wiki/Gargi_fonts
For a slightly more complete list: If I go to https://meta.wikimedia.org/wiki/List_of_Wikipedias and use web-inspector to list the fonts used on the page, in Firefox Ubuntu, with a few non-stock fonts installed, I get:
TakaoPGothic, Lohit Bengali, Zawgyi-One, Meera, Pothana2000, Georgia, NanumGothic, Ubuntu, DejaVu Sans, Droid Sans Mono, Kedage Normal, gargi, Lohit Tamil, Rekha, Droid Sans Fallback, Free Serif, ori1Uni Medium, Free Sans, Khmer OS, Waree, DejaVu Sans, Saab, mry_KacstQurn, jomolhari, brahmi, Nuosu
And that results in all language names rendered correctly. There's almost certainly overlap, so I'm not sure what the minimum set of required fonts would be. Possibly ask https://lists.wikimedia.org/mailman/listinfo/languages or https://lists.wikimedia.org/mailman/listinfo/mediawiki-i18n
HTH. Quiddity
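The "minimum set of required fonts" question above can be framed as a set-cover problem. The following is only a sketch, assuming a map from font name to the characters it can render is already available. `greedy_font_set` is a hypothetical helper, and FontA, FontB and FontC with their coverage are made-up placeholders, not real font data. Greedy selection yields a small set, not necessarily the smallest one.

```python
from typing import Dict, List, Set, Tuple


def greedy_font_set(
    required: Set[str], coverage: Dict[str, Set[str]]
) -> Tuple[List[str], Set[str]]:
    """Pick fonts greedily until every required character is covered.

    Returns (chosen fonts in pick order, characters no font covers).
    """
    remaining = set(required)
    chosen: List[str] = []
    while remaining:
        # Take the font covering the most still-uncovered characters.
        # Iterating over sorted names makes tie-breaking deterministic.
        best = max(sorted(coverage), key=lambda name: len(coverage[name] & remaining))
        gained = coverage[best] & remaining
        if not gained:
            break  # nothing left that any font can render
        chosen.append(best)
        remaining -= gained
    return chosen, remaining


if __name__ == "__main__":
    # Placeholder coverage data, for illustration only.
    demo = {"FontA": set("abc"), "FontB": set("cd"), "FontC": set("de")}
    print(greedy_font_set(set("abcde"), demo))
```

With real data, `required` would be the characters actually present in the text to render, which keeps the chosen set tied to the task rather than to whole scripts.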
FreeSerif and even FreeSans have Devanagari for Hindi, but lack other things. There are helpful coverage tables; a script exists that could be adapted to produce some for other fonts. https://bugzilla.wikimedia.org/show_bug.cgi?id=59983#c5
Nemo
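As a rough illustration of what a coverage table computes (this is not the script referenced in the bug report, just a sketch under stated assumptions): given the set of codepoints a font supports, count how many named characters of each Unicode block it covers. The function name `coverage_table` and the short `BLOCKS` list are assumptions for the example. Extracting the codepoint set from a real font file (for instance from its cmap table) is left out.

```python
import unicodedata
from typing import Dict, Iterable, Tuple

# A few Unicode block ranges (inclusive). A real table would list many more.
BLOCKS: Dict[str, Tuple[int, int]] = {
    "Basic Latin": (0x0000, 0x007F),
    "Cyrillic": (0x0400, 0x04FF),
    "Arabic": (0x0600, 0x06FF),
    "Devanagari": (0x0900, 0x097F),
}


def coverage_table(font_codepoints: Iterable[int]) -> Dict[str, Tuple[int, int]]:
    """Return {block name: (covered, total)}, counting only named characters."""
    supported = set(font_codepoints)
    table: Dict[str, Tuple[int, int]] = {}
    for block, (start, end) in BLOCKS.items():
        # unicodedata.name() has no name for control characters and
        # unassigned codepoints, so those are excluded from the totals.
        named = [cp for cp in range(start, end + 1) if unicodedata.name(chr(cp), "")]
        covered = sum(1 for cp in named if cp in supported)
        table[block] = (covered, len(named))
    return table


if __name__ == "__main__":
    # Pretend font that only supports the uppercase Latin letters A-Z.
    for block, (covered, total) in coverage_table(range(ord("A"), ord("Z") + 1)).items():
        print(f"{block}: {covered}/{total}")
```

Block totals for some ranges depend on the Unicode version bundled with the Python interpreter, so the numbers can differ slightly between installations.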
On 7/30/14, 12:12 AM, Federico Leva (Nemo) wrote:
No such font exists. You can try DejaVu Sans or FreeSerif for best coverage.
There's also a newish font from Google that has quite wide coverage: https://code.google.com/p/noto/
In general, "one font to rule them all" is highly discouraged/impractical as a means to achieve reasonable results in a variety of world languages. Indic fonts, for example, typically contain complex shaping engines in bytecode -- it's just not practical to try to write one engine for everything. All of the "one font with wide coverage" attempts that I have seen look okay for Latin languages, but fail for the rest of the world.
<rant>...which is typically the opinion inadvertently expressed by these "characters from every language" projects anyway. Without real knowledge of the rest of the world's languages and scripts, we get something that shows that the creator valued the rest of the world only for "looking exotic", and was not interested in true understanding.</rant>
Most modern font systems have a mechanism to merge multiple fonts under one virtual name as needed in order to get good coverage. So you don't need to find one font to rule them all. --scott
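To make the "multiple fonts under one virtual name" idea concrete, here is a minimal sketch of per-character fallback, assuming an ordered fallback chain and a known coverage set per font. `split_into_font_runs`, LatinFont and DevaFont are hypothetical names for illustration. Real font systems are more careful than this (for example, they avoid splitting a base character from its combining marks), so treat it as a model of the idea rather than of any particular implementation.

```python
from typing import Dict, List, Optional, Set, Tuple


def split_into_font_runs(
    text: str, chain: List[str], coverage: Dict[str, Set[str]]
) -> List[Tuple[Optional[str], str]]:
    """Split text into runs, each tagged with the first font in `chain`
    that covers the character, or None when no font in the chain does."""
    runs: List[Tuple[Optional[str], str]] = []
    for ch in text:
        font = next((name for name in chain if ch in coverage.get(name, set())), None)
        if runs and runs[-1][0] == font:
            # Same font as the previous character: extend the current run.
            runs[-1] = (font, runs[-1][1] + ch)
        else:
            runs.append((font, ch))
    return runs


if __name__ == "__main__":
    # Placeholder fonts and coverage, for illustration only.
    demo_coverage = {"LatinFont": set("ab "), "DevaFont": set("\u0939\u093f")}
    print(split_into_font_runs("ab \u0939\u093f?", ["LatinFont", "DevaFont"], demo_coverage))
```

The order of `chain` matters: when two fonts cover the same character, the earlier one wins, which is why the choice of chain per language comes up next in the thread.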
ps. "Font synthesis" systems actually have a big problem in that parts of the Unicode character space are shared by different languages with different rules for shaping, ligatures, etc. So you really need to explicitly annotate the language and then choose a font specific to that *language*, not rely simply on codepoint. (Unfortunately, much of the "foreign language" content in Wikipedia, i.e. short texts which are not in the main language of the wiki, is not explicitly annotated with language information.)
pps. for those actually interested in getting the details of world writing systems correct, I could use some help with the new OCG PDF rendering backend, which just went live in production yesterday. It uses XeLaTeX, which actually does pay careful attention to Indic shaping and ligatures, etc, but it is not a "modern system" as described above in terms of synthesizing coverage from multiple fonts. Patches would be helpful to make better guesses about the native language of "foreign language" spans, which would then ensure an appropriate font was used.
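The point about choosing a font by annotated language rather than by codepoint can be sketched as follows. This is an assumption-laden illustration, not how the OCG backend works: `font_for_span`, the `FONT_BY_LANG` table and every font name in it are placeholders. The lookup walks a language tag from most to least specific, and spans with no annotation fall through to a default, which is exactly the case where a wrong font is likely.

```python
from typing import Dict, Optional

# Hypothetical language-tag -> font table. The font names are placeholders.
FONT_BY_LANG: Dict[str, str] = {
    "ja": "ExampleJapaneseFont",
    "zh": "ExampleChineseFont",
    "hi": "ExampleDevanagariFont",
}
DEFAULT_FONT = "ExampleDefaultFont"


def font_for_span(lang: Optional[str]) -> str:
    """Choose a font from a span's language tag, not from its codepoints."""
    if lang:
        # Try the full tag first, then drop trailing subtags:
        # "zh-Hant-TW", then "zh-Hant", then "zh".
        parts = lang.lower().split("-")
        while parts:
            font = FONT_BY_LANG.get("-".join(parts))
            if font:
                return font
            parts.pop()
    return DEFAULT_FONT


if __name__ == "__main__":
    # The same Han character gets a different font depending on the annotation.
    for tag in ("ja", "zh-Hant-TW", None):
        print(tag, "->", font_for_span(tag))
```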
I'm trying to render an image which uses characters from all of the languages supported by WP. Is there a single font deployed on production servers that includes all scripts?
The Autonym font includes characters for all the languages supported by MediaWiki, but only a small subset of the characters for each: https://www.mediawiki.org/wiki/Universal_Language_Selector/AutonymFont
Ryan Kaldari
On Wed, Jul 30, 2014 at 7:48 AM, C. Scott Ananian cananian@wikimedia.org wrote:
[...]