On Tue, Nov 18, 2003 at 04:28:32AM -0500, Daniel Mayer wrote:
Peter Gervai wrote:
Could you point us to the page and revision of the problem?
couple examples:
http://meta.wikipedia.org/w/wiki.phtml?title=What_to_do_with_www.wikipedia.o... http://meta.wikipedia.org/w/wiki.phtml?title=Main_Page&diff=20132&ol...
This happens on meta's Main Page often. Ask Anthere and Erik for other examples.
I see. The first is not a good example: Opera 5 is _ancient_, and you can't expect anyone to support it, since upgrading is clearly painless.
Second example is indeed valid, but it isn't a problem for you: if the page does not contain non-8859-1 characters, nothing gets garbled. If it does contain others then, well, you *need* utf-8 on that page anyway. (Embed codes are a little bit slow to type, don't you agree? If not, write your reply manually by using embeds. :))
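To make the point about 8859-1 concrete, here is a minimal sketch in Python (not anything from the wiki software; the helper name `fits_latin1` is made up for illustration). It only checks whether a piece of text can be stored in ISO-8859-1 without loss:

```python
def fits_latin1(text: str) -> bool:
    """Return True if every character of text exists in ISO-8859-1."""
    try:
        text.encode("iso-8859-1")
        return True
    except UnicodeEncodeError:
        # At least one character has no 8859-1 code point.
        return False


# Accented Western European letters are fine in 8859-1.
plain = "Budapest caf\u00e9"
# Curly quotes (U+201C, U+201D) and the en dash (U+2013) are not.
fancy = "\u201cBudapest\u201d \u2013 city"

print(fits_latin1(plain))
print(fits_latin1(fancy))
```

A page for which such a check passes survives an 8859-1 round trip untouched; a page for which it fails is one that needs UTF-8 anyway.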
I'm curious what kind of problem it might have been, as many of the Wikipedias are in UTF-8 from the start, and we had no problem whatsoever.
Probably because their browsers work nicely in UTF-8 because they have to. If they didn't they would be useless for any language where UTF-8 is required. In places where UTF-8 isn't required, browsers that can't support it tend to slip by without being fixed or upgraded. If it ain't broke...
I understand your problem, it is valid, and that's probably the reason it's a topic on wikitech. Still, I believe we can expect editors to use non-ancient browsers (remember, reading is not a problem). As far as I know, most browsers handle this very well (including, for example, Unix character-mode browsers).
However, we *do* have problems with the English Wikipedia when pages contain unrepresentable literal characters, which make the page break after editing. See the "Budapest" article on Wikitravel, where every special dash and curly quote mark became a question mark. Truly ugly.
I don't understand. Is Wikitravel in UTF-8?
Not at all! It clearly shows what happens when a page _IS_ 8859-1 encoded but editors want to use fancy characters. The same happens when they do it with an old browser on UTF-8 pages. So you get trash either way, and other editors revert it the same way, so you may as well use UTF-8, don't you think? :-)
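For illustration, a rough sketch in Python of how the question marks can come about. This is an assumption about the mechanism, not a trace of what any particular browser or the wiki software does (some browsers send numeric character references instead), and the function name `save_as_latin1` is hypothetical:

```python
def save_as_latin1(text: str) -> str:
    """Simulate storing text on an ISO-8859-1 page and reading it back.

    Characters with no 8859-1 code point are replaced by '?',
    which is Python's behaviour for errors="replace" when encoding.
    """
    stored = text.encode("iso-8859-1", errors="replace")
    return stored.decode("iso-8859-1")


# Curly quotes and an en dash, written as escapes so the source stays ASCII.
edited = "\u201cBudapest\u201d \u2013 city"

print(save_as_latin1(edited))  # prints: ?Budapest? ? city
```

Once the text is stored this way the original characters are gone, so the only fix is a revert or a manual re-edit.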
Peter