Kai F. Lahmann wrote:
In the case of ISO 8859-1 characters outside the ASCII range (e.g. á, é, ñ, ç, etc.), UTF-8 actually needs more space: 2 bytes versus 1 byte, I think. Nevertheless, nl: has few of these characters compared with es: or fr:.
Those are 128 characters, each of which gets 1 byte bigger, yes. But there are also many characters that get smaller: the HTML entities are 6-8 bytes long, while the same character in UTF-8 is only 3-4 bytes. And don't forget links like [[Lodz|Łódź]]: 28 bytes before, 14 bytes after (maybe a little different). The same goes for all interwiki links to non-Latin Wikipedias, and even for many Eastern European WPs. On DE: (known for its massive use of ä, ö, ü and ß) we saw no visible growth with the conversion.
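The byte counts above can be checked with a short sketch. The helper name byte_len and the exact sample strings are assumptions made for illustration; in particular, the entity spelling of the link (numeric entities throughout) and the form of the link after conversion are guesses, so the totals may differ from the figures quoted above.

```python
# Hypothetical helper and sample strings, for illustration only.
def byte_len(text: str, encoding: str) -> int:
    """Return the number of bytes `text` occupies in `encoding`."""
    return len(text.encode(encoding))

# Link written with numeric HTML entities, as needed on a Latin-1 wiki
# (assumed spelling; named entities such as &oacute; would be longer).
entity_link = "[[Lodz|&#321;&#243;d&#378;]]"
# The same link once the title can be stored directly as UTF-8
# (assumed form after conversion).
utf8_link = "[[Łódź]]"

sizes = {
    "entity link (Latin-1)": byte_len(entity_link, "iso-8859-1"),
    "direct link (UTF-8)": byte_len(utf8_link, "utf-8"),
    "ä (Latin-1)": byte_len("ä", "iso-8859-1"),
    "ä (UTF-8)": byte_len("ä", "utf-8"),
}

for label, size in sizes.items():
    print(f"{label}: {size} bytes")
```

Under these assumptions the entity form comes to 28 bytes and the direct UTF-8 form to 11, while a single ä grows from 1 byte to 2. Each of the Polish letters Ł, ó and ź takes 2 bytes in UTF-8.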
I really don't know why we are discussing the space requirements of UTF-8 versus Latin-1 here. This is absolutely not a criterion, a problem, an issue, or a concern.
We want UTF-8 because it allows us to do things that Latin-1 cannot do (in particular, have article titles with proper characters). Talking about space requirements is pointless. Something that can do less will *obviously* require less space for the things it *can* do, but what use is that if it cannot do what we want?
Timwi