Rotem Liss <mail@...> writes:
As for the problem in the Simple English Wikipedia, I think its $wgLanguageCode should be changed to "en", because it uses the English messages.
Shinjiman wrote:
Brion Vibber <brion <at> ...> writes:
Shinjiman wrote:
The issue has been mentioned at http://mail.wikipedia.org/pipermail/wikitech-l/2006-March/034397.html on wikitech-l, and also in Bugzilla bug 5790: http://bugzilla.wikimedia.org/show_bug.cgi?id=5790.
Hi all,
As I mentioned on the pages above, some wiki sites have incorrect lang tags that correspond neither to ISO 639 nor to the IANA language tags. I examined the issue and submitted a patch to Bugzilla. However, the patch could not be accepted, as Brion said it would break the caching system.
There was not any such patch for this, and it would be totally unnecessary to make one. Just let us know what the incorrect ones are and we'll fix the configuration.
Are you maybe thinking of the patch for something totally different which tries to guess the visitor's language variant and change the lang attribute based on it?
For example, the Simple English Wikipedia shows the lang tag as "simple", which exists neither in ISO 639 nor as an IANA language tag. For 'Simple English', the lang code should be "en" (English).
When Traditional Chinese readers read a Traditional Chinese web page, the page is supposed to be displayed in a Traditional Chinese font. However, this does not happen on Wikipedia (or on other wiki sites running MediaWiki). The Chinese Wikipedia (and other Chinese-language wiki sites running MediaWiki) has introduced the LanguageConverter class, and its lang tag is "zh" (Chinese). This is not a problem for Simplified Chinese readers, because the major browsers will use Simplified Chinese fonts. It is a problem for Traditional Chinese readers, however, who end up reading Chinese text in a Simplified Chinese font.
As for the approach you mentioned, using the language variant to determine the language code the user is using: it is practically impossible to determine the lang tag from the language variant. Many users on the Chinese Wikipedia currently leave the language variant disabled by default, because the Chinese word conversion still causes many problems. (This is the main point of the issue.) I am one of them: I use a Traditional Chinese interface language (UI language = zh-hk) _without enabling_ any language variant (Variant = zh). So the patch I submitted does not determine the lang tag from the language variant alone; it checks both the interface language and $wgContLanguageCode (the global interface language).
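The patch itself is not reproduced in this thread, so the following is only a rough sketch of one way such a check could work, written in Python for illustration rather than as MediaWiki PHP. The function name and the rule (trust the interface language only when it shares the content language's primary subtag) are assumptions, not the patch's actual logic.

```python
def effective_language_code(content_code: str, ui_code: str) -> str:
    """Pick the code that the lang tag would be derived from.

    Assumed rule: the interface language is used only when it is a
    regional form of the content language (same primary subtag);
    otherwise the content language code is kept. The variant setting
    is deliberately not consulted.
    """
    content_primary = content_code.split("-")[0].lower()
    ui_primary = ui_code.split("-")[0].lower()
    if ui_primary == content_primary:
        return ui_code
    return content_code


# A zh-hk interface on a "zh" wiki yields zh-hk even with Variant = zh.
print(effective_language_code("zh", "zh-hk"))
# An English interface on a "zh" wiki keeps the content language.
print(effective_language_code("zh", "en"))
```

Because the result varies with the reader's interface language, a rule like this would interact with page caching, which is the concern raised earlier in the thread.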
To summarise, I suggest adding a new attribute that assigns the lang tag correctly, using an array or something that provides similar functionality. For example:
Language code            | Language tag
-------------------------+----------------------------------------------
en                       | en
de                       | de
simple (Simple English)  | en
zh-cn                    | zh-cn   <= originally supposed to be zh-hans (R1)
zh-sg                    | zh-sg   <= originally supposed to be zh-hans (R1)
zh-tw                    | zh-hant
zh-hk                    | zh-hant
zh-mo                    | zh-hant
zh-min-nan               | zh-hant <= (R2)
zh-yue                   | zh-hant <= (R2)
*Remarks:
1. The tags zh-cn/zh-sg are used instead of zh-hans for browser compatibility (IE6 will likely misunderstand the zh-hans lang tag).
2. The tag zh-hant is used instead of zh-min-nan/zh-yue for browser compatibility (both IE and Firefox will likely misunderstand the zh-min-nan and zh-yue lang tags).
The table above is meant to construct a lang tag mapping for the various languages.
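A minimal sketch of such a mapping, in Python for illustration rather than as MediaWiki PHP. The names LANG_TAG_MAP and lang_tag_for are hypothetical; the entries are copied from the table above, and any code not listed falls back to itself.

```python
# Hypothetical lookup table: wiki language code -> lang tag to emit.
# Only codes whose tag needs attention are listed; "en" and "de"
# fall through unchanged.
LANG_TAG_MAP = {
    "simple": "en",
    "zh-cn": "zh-cn",        # kept instead of zh-hans (remark 1)
    "zh-sg": "zh-sg",        # kept instead of zh-hans (remark 1)
    "zh-tw": "zh-hant",
    "zh-hk": "zh-hant",
    "zh-mo": "zh-hant",
    "zh-min-nan": "zh-hant", # remark 2
    "zh-yue": "zh-hant",     # remark 2
}


def lang_tag_for(language_code: str) -> str:
    """Return the mapped lang tag, or the code itself if unmapped."""
    return LANG_TAG_MAP.get(language_code, language_code)


for code in ("en", "simple", "zh-hk"):
    print(code, "->", lang_tag_for(code))
```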
Once this language mapping is in place, OutputPage.php (for older skins) and Monobook.php (for newer skins) need to be modified to output <html xml:lang="XXX" lang="XXX"> correctly (where "XXX" is the correct lang code, rather than $wgContLanguageCode used directly) to address this issue.
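The output step could look roughly like the sketch below. It is written in Python for illustration and is not the code in OutputPage.php or Monobook.php; html_open_tag and its tag_overrides parameter are hypothetical names, and the override table is passed in by the caller.

```python
from html import escape

XHTML_NS = "http://www.w3.org/1999/xhtml"


def html_open_tag(content_language_code: str, tag_overrides: dict) -> str:
    """Build the opening <html> tag using a corrected lang tag.

    The content language code is looked up in tag_overrides first and
    used directly only when no override exists.
    """
    lang_tag = tag_overrides.get(content_language_code, content_language_code)
    # Escape the value so it is always safe inside a quoted attribute.
    quoted = escape(lang_tag, quote=True)
    return f'<html xmlns="{XHTML_NS}" xml:lang="{quoted}" lang="{quoted}">'


print(html_open_tag("simple", {"simple": "en"}))
```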
I hope the information I've provided helps you address this kind of issue more smoothly. :)
Regards,
Shinjiman
Wikitech-l mailing list Wikitech-l@... http://mail.wikipedia.org/mailman/listinfo/wikitech-l
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="XXX" lang="XXX">
The lang (and xml:lang) attribute on the HTML tag is incorrect for some languages, and its value should not simply be identical to $wgContLanguageCode. For example, there is no language tag called "simple" according to ISO 639, RFC 1766, or RFC 3066 (R1, R2). That is the reason for the patch I previously submitted to bug 5790. Its main purpose is to add a new language tag mapping for the user interface languages that currently use an incorrect language tag, and to change how the value "XXX" is obtained, so that it is no longer taken directly from $wgContLanguageCode. I think Brion may not fully understand the actual situation on some language wikis. In any case, a resolution to this issue is worth considering.
An alternative way to solve this issue is to add a new text field that makes the lang attribute in the HTML tag customisable, so that any logged-in user can change the value in their preferences. (This idea was originally suggested by 百楽兎 [http://zh.wikipedia.org/wiki/User:%CE%A0rate].)
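A rough sketch of how such a preference might be applied, in Python for illustration only. The function and parameter names are hypothetical. The shape check follows the RFC 3066 syntax (a primary subtag of 1 to 8 letters, then optional subtags of 1 to 8 letters or digits); checking the value at all is my own assumption, added because a free-text field would otherwise end up in the HTML tag unverified.

```python
import re
from typing import Optional

# RFC 3066 shape: 1-8 letters, then "-" plus 1-8 letters/digits, repeated.
_TAG_SHAPE = re.compile(r"[A-Za-z]{1,8}(-[A-Za-z0-9]{1,8})*")


def effective_lang_tag(site_tag: str, user_pref: Optional[str]) -> str:
    """Use the user's preferred lang tag when it is set and well-formed;
    otherwise fall back to the site-wide tag."""
    if user_pref:
        candidate = user_pref.strip()
        if _TAG_SHAPE.fullmatch(candidate):
            return candidate
    return site_tag


print(effective_lang_tag("zh", "zh-hant"))  # preference is used
print(effective_lang_tag("zh", None))       # no preference set
```

Anonymous readers would always get the site-wide tag, so this alone does not fix the default output.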
References
==========
R1: W3C, Language information and text direction [http://www.w3.org/TR/html4/struct/dirlang.html]
R2: W3C, Language tags in HTML and XML [http://www.w3.org/International/articles/language-tags/]