On Dec 6, 2004, at 11:11 PM, Nick Triantos wrote:
> I've seen a few threads go by about not being able to use "funny" characters in page names... Everything from non-US characters, to the plus sign ( + ) and apostrophe ( ' ).
> Is there a reason that the page titles aren't just stored in the database in urlencode( )'d page titles? When a search, etc. is performed, we could just convert the string to its urlencode( ) equivalent, and do the same for [[ ]] type links.
The storage of titles in the database is *in no way at all* related to limitations of what characters we allow in titles. Storing them in the database urlencode()d would make absolutely no difference except to complicate the code and bloat the database.
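To put a rough number on the bloat, here is a minimal sketch using Python's standard urllib.parse rather than PHP's urlencode(); the sample title is an arbitrary illustration, not taken from any real wiki:

```python
from urllib.parse import quote, unquote

# Hypothetical non-ASCII page title, chosen only for illustration.
title = "Ελλάδα"

raw_bytes = len(title.encode("utf-8"))   # size if stored as plain UTF-8
encoded = quote(title, safe="")          # percent-encode every byte
encoded_len = len(encoded)               # size if stored URL-encoded

# Each non-ASCII byte turns into a three-character %XX sequence,
# so this title triples in size while carrying the same information.
round_trip = unquote(encoded)
```

The encoded form decodes back to exactly the same title, so nothing is gained in what can be represented; only the stored size and the number of conversions change.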
What characters we allow in titles is *only* related to the wiki link syntax and HTML and URL encoding/decoding issues.
It's extremely helpful for titles to be idempotent under URL decoding: data may be decoded multiple times, for instance due to mod_rewrite hell, and for historical reasons we in some places allow URL-encoded text to be used in wiki links themselves. Interwiki links, at least, let %xx hex codes pass through unmolested.
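A small sketch of what "idempotent under URL decoding" means in practice. It uses Python's urllib.parse.unquote as a stand-in decoder, and the helper name is_decode_stable is made up for this example; it is not how the wiki software itself checks titles:

```python
from urllib.parse import unquote

def is_decode_stable(title: str) -> bool:
    """True if one more round of URL decoding leaves the title unchanged."""
    return unquote(title) == title

# A title containing a literal %xx sequence changes every time it is decoded,
# so an extra decode somewhere in the request path silently renames it.
double_encoded = "100%2525"
once = unquote(double_encoded)    # one decode
twice = unquote(once)             # an accidental second decode

stable_plain = is_decode_stable("Main Page")
stable_percent = is_decode_stable("100%25")
```

Here once and twice differ, which is the failure mode: the same request can resolve to different titles depending on how many decoding passes it went through.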
> (It's pretty annoying to me that you can't name a page "C++") :-)
The + issue has to do with multiple encoding/decoding issues and backwards-compatible links (in URL encoding, + represents a space).
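The "C++" case can be sketched as follows. This assumes form-style decoding, where + means a space, as implemented by Python's urllib.parse.unquote_plus; it illustrates the ambiguity and is not the wiki's actual request handling:

```python
from urllib.parse import quote, unquote_plus

title = "C++"

# If the raw title lands in a URL and is decoded form-style,
# both plus signs come back as spaces.
decoded_raw = unquote_plus(title)

# Percent-encoding the plus signs survives exactly one decode...
encoded = quote(title, safe="")
decoded_once = unquote_plus(encoded)

# ...but a second decode (for example after a rewrite rule has already
# decoded the path once) turns the restored plus signs into spaces.
decoded_twice = unquote_plus(decoded_once)
```

So even a correctly encoded link is only safe if it is decoded exactly once, and old links that used + for a space would have to keep working as well.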
-- brion vibber (brion @ pobox.com)