Kai F. Lahmann wrote:
For ISO 8859-1 characters outside the ASCII range (e.g. á, é, ñ, ç),
UTF-8 actually needs more space: 2 bytes versus 1 byte, I think.
Nevertheless, nl: has few of these characters compared with es: or fr:.
Those are 128 characters, each of which gets 1 byte bigger, yes. But there are
also many characters that get smaller: the HTML entities are 6-8 bytes long,
while the same character in UTF-8 is only 2-4 bytes. And don't forget links
like [[Lodz|Łódź]]: 28 bytes before, 14 bytes after (maybe a little different).
The same goes for all interwiki links to non-Latin Wikipedias, and even for
many East European WPs. On DE: (known for its heavy use of ä, ö, ü and ß) we
saw no visible growth with the conversion.
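
The byte counts above can be checked with a short sketch. It assumes the
"before" form is Latin-1 bytes with a decimal numeric entity (e.g. &#321;) for
every character Latin-1 lacks; the function names are made up for this
illustration. With named entities such as &oacute; the "before" figure would
be larger, so the exact numbers depend on how the link was written.

```python
def latin1_with_entities(text: str) -> bytes:
    """Encode as Latin-1; characters outside Latin-1 become &#NNN; entities."""
    return text.encode("latin-1", errors="xmlcharrefreplace")


def utf8(text: str) -> bytes:
    """Encode the same text as UTF-8."""
    return text.encode("utf-8")


# [[Lodz|Łódź]] written with escapes so the source file encoding does not matter.
link = "[[Lodz|\u0141\u00f3d\u017a]]"
link_before = len(latin1_with_entities(link))
link_after = len(utf8(link))

# ä, ö, ü, ß: all inside Latin-1, so these are the characters that grow.
umlauts = "\u00e4\u00f6\u00fc\u00df"
umlauts_before = len(latin1_with_entities(umlauts))
umlauts_after = len(utf8(umlauts))

print("link:   ", link_before, "->", link_after)
print("umlauts:", umlauts_before, "->", umlauts_after)
```

Under these assumptions the link shrinks while the umlauts double in size, which
fits the observation that the two effects can roughly cancel out.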
I really don't know why we are discussing the space requirements of
UTF-8 versus Latin-1 here. This is absolutely not a criterion, a problem,
an issue, or a concern.
We want UTF-8 because it allows us to do things that Latin-1 cannot do
(in particular, have article titles with proper characters). Talking
about space requirements is pointless. Something that can do less will
*obviously* require less space for the things it *can* do, but what use
is that if it cannot do what we want?
Timwi