On Mon, 2003-01-20 at 18:53, Pieter Suurmond wrote:
That's great Brion Vibber! Now http://www.wikipedia.org/ is indeed 100% valid. The Dutch site still contains some errors but that's mainly due to things like "оÑ?Ñ?иÑ?">Russkiy</a>" on http://nl.wikipedia.org. I'll try to repair those manually...
Another problem is the inclusion of external links with ampersands in them. Strictly speaking, ampersands in links need to be '&amp;' instead of '&' because entity interpretation is done before the tag attributes are; all our generated links take this into account (I think!), but we don't presently provide such conversion for inline external links in a wiki page (in part because we don't know what the author intended...)
It would, I think, not be unreasonable for the wiki parser to automatically convert &s not followed by an entity body to &amp; ... should we allow things that do look like entities to go through intact, though? Or just escape all &s? (Upside: simple, consistent behavior. Downside: inconsistent with wikilinks, where entities are allowed.)
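To make the two options concrete, here is a minimal sketch in Python (not the wiki's own code). The function names and the regular expression are illustrative assumptions about what "looks like an entity" might mean, not the parser's actual behavior.

```python
import re

# Hypothetical pattern: an "&" that is NOT followed by something that looks
# like an entity body (a name, a decimal reference, or a hex reference)
# terminated by ";".
BARE_AMP = re.compile(r"&(?!(?:[A-Za-z][A-Za-z0-9]*|#[0-9]+|#[xX][0-9A-Fa-f]+);)")

def escape_bare_ampersands(url: str) -> str:
    """Option 1: escape only ampersands that do not start an entity-like run."""
    return BARE_AMP.sub("&amp;", url)

def escape_all_ampersands(url: str) -> str:
    """Option 2: escape every ampersand, entity-like or not."""
    return url.replace("&", "&amp;")

link = "http://example.com/page?a=1&b=2"
print(escape_bare_ampersands(link))
print(escape_all_ampersands("x &amp; y"))
```

Under option 1 a query string such as `?a=1&copy;x` would pass through untouched, which is exactly the "we don't know what the author intended" case; option 2 avoids the guess at the cost of double-escaping text that was already an entity.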
I've another wish: could a script on the wiki server do automatic character-to-entity conversion, like: à --> &agrave;, ë --> &euml;?
I'd rather do the other way around, convert input entities to real characters and keep them that way; entities are bandwidth hogs and not really particularly helpful. Text on the Chinese and Japanese wikis for instance would take about 3 times the bandwidth they presently do using numeric entities instead of UTF-8.
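A rough way to check the size difference, sketched in Python with only the standard library. The helper name and the three-character sample are assumptions chosen for illustration; real page text mixes in ASCII markup, so the ratio will vary.

```python
import html

def to_numeric_entities(text: str) -> str:
    """Encode every non-ASCII character as a decimal numeric entity."""
    return text.encode("ascii", "xmlcharrefreplace").decode("ascii")

# Three CJK characters, written as escapes so the sample survives any editor.
sample = "\u65e5\u672c\u8a9e"

as_entities = to_numeric_entities(sample)
utf8_size = len(sample.encode("utf-8"))          # bytes on the wire as UTF-8
entity_size = len(as_entities.encode("ascii"))   # bytes on the wire as entities

print(as_entities, utf8_size, entity_size)

# Going the other way (entities in, real characters kept) is a single call.
assert html.unescape(as_entities) == sample
```

For this sample the entity form is 24 bytes against 9 bytes of UTF-8, a bit under three times the size.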
If you want to see the names of the entities in the textarea (so as to avoid the editors that damage non-ASCII text), they have to be escaped again with an &amp;, and any text that uses non-trivial amounts of non-ASCII characters becomes illegible. As an option and for known unfriendly user agents it may be helpful, but I'd avoid it if I could.
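The double escaping can be sketched in a couple of lines of Python. The function name is hypothetical, and the sketch assumes the page text is stored with named entities in it.

```python
import html

def for_textarea(stored: str) -> str:
    """Escape stored text so entity names show literally inside a <textarea>."""
    # quote=False: only &, < and > are escaped; quotes are left alone.
    return html.escape(stored, quote=False)

stored = "caf&eacute;"
sent = for_textarea(stored)
print(sent)

# The browser decodes one level, so the editor sees the entity name again.
assert html.unescape(sent) == stored
```

Every entity in the page picks up an extra `amp;` on its way to the browser, which is where the added bulk comes from.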
Well, I'm only suggesting... Anyhow, thanks for fixing the English front page so very quickly!
You're welcome!
-- brion vibber (brion @ pobox.com)