Petr Kadlec wrote:
I believe that a correct solution (apart from the long-term solution
of using UTF-8 everywhere) could be:
* Accept UTF-8 in URLs on en: (but how could they be recognized??)
* Interwiki linking should use UTF-8 even on en: (or does any
Wikipedia other than en: use Latin-1?)
To your last question: yes, there are a few. For those wikis that are now
UTF-8 but were ISO-8859-1 before, the ISO-8859-1 %XX code is also still
accepted.
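
One way a server could accept both forms is to try UTF-8 first and fall back to ISO-8859-1 only when the bytes are not valid UTF-8. The sketch below is just an illustration of that heuristic, not the actual wiki software; the function name decode_title is made up for this example. It can misfire in the rare case where a Latin-1 byte sequence also happens to be valid UTF-8.

```python
from urllib.parse import unquote_to_bytes


def decode_title(quoted: str) -> str:
    """Decode a %XX-escaped title, guessing between UTF-8 and ISO-8859-1.

    Hypothetical helper: assumes UTF-8 whenever the bytes are valid UTF-8,
    and treats everything else as ISO-8859-1.
    """
    raw = unquote_to_bytes(quoted)  # turn %XX escapes back into raw bytes
    try:
        return raw.decode("utf-8")  # strict: raises on invalid sequences
    except UnicodeDecodeError:
        return raw.decode("iso-8859-1")  # every byte maps to a character


# Both spellings of the same title decode to the same text.
print(decode_title("Caf%C3%A9"))  # UTF-8 escape
print(decode_title("Caf%E9"))     # ISO-8859-1 escape
```
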
Indeed, it is better to avoid %XX codes in interwiki links. A
reasonably good alternative is to use named (&name;) and numeric (&#NNN;)
character entities, which are independent of the page encoding. The
pywikipediabot will take this route for all links that cannot be expressed
natively, and the interwiki bot will convert all %XX links automatically
in passing (but only if other updates to the page are needed).
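
To illustrate why the entity form is independent of the encoding, here is a minimal Python sketch of such a conversion. It is not the pywikipediabot code; the function name percent_to_entities and its encoding parameter are assumptions made for this example, and the caller has to know which encoding the %XX codes were written in.

```python
from urllib.parse import unquote_to_bytes


def percent_to_entities(link: str, encoding: str = "utf-8") -> str:
    """Rewrite a %XX-escaped link using numeric character references.

    Hypothetical helper: `encoding` names the encoding the %XX escapes
    were produced in (for example "utf-8" or "iso-8859-1").
    """
    text = unquote_to_bytes(link).decode(encoding)
    # Non-ASCII characters become &#NNN; references; ASCII stays as-is.
    return text.encode("ascii", "xmlcharrefreplace").decode("ascii")


# The same title, escaped in two different encodings, yields one result.
print(percent_to_entities("Caf%C3%A9", "utf-8"))
print(percent_to_entities("Caf%E9", "iso-8859-1"))
```

Both calls produce the same ASCII-only text, so the resulting link no longer depends on whether the target wiki uses UTF-8 or ISO-8859-1.
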
Regards,
Rob Hooft
--
Rob W.W. Hooft || rob(a)hooft.net ||
http://www.hooft.net/people/rob/