On 1/7/09 2:44 PM, Ilmari Karonen wrote:
> Brion Vibber wrote:
>> The 'maxlength' on the input field is enforced in Unicode *characters* by the browser.
>
> I developed a JavaScript workaround some time ago to limit edit summaries to exactly 250 bytes. It's pretty hacky, but works:
> http://en.wikipedia.org/wiki/User:Ilmari_Karonen/longeditsummary.js
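To make the character-versus-byte mismatch concrete, here is a minimal Python sketch (not taken from the linked script; `utf8_len` is a hypothetical helper name). It only assumes that summaries are stored as UTF-8, so a character-based 'maxlength' of 200 can admit far more than 250 bytes.

```python
def utf8_len(text: str) -> int:
    """Return the number of bytes the text occupies when encoded as UTF-8."""
    return len(text.encode("utf-8"))


# Two summaries with the same character count, as a browser's maxlength sees them.
ascii_summary = "a" * 200
hanzi_summary = "\u6c49" * 200  # U+6C49 takes 3 bytes in UTF-8

for label, summary in (("ASCII", ascii_summary), ("hanzi", hanzi_summary)):
    print(label, len(summary), "characters,", utf8_len(summary), "bytes")
```

Both strings pass a 200-character limit, but only the ASCII one fits in a 250-byte field.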
Hmm... offhand, I'm not sure how reliably this would interact with things like cut-and-paste. If in one paste operation we try to replace 80 hanzi characters with 200 ASCII characters, what happens?
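For what it's worth, the arithmetic of that paste case can be sketched in Python. This is only an illustration under assumptions: `paste_over` and `char_cap` are hypothetical, and I have not checked whether the script actually works by adjusting a character cap.

```python
LIMIT_BYTES = 250  # byte limit mentioned earlier in the thread


def utf8_len(text: str) -> int:
    """Return the UTF-8 encoded size of the text in bytes."""
    return len(text.encode("utf-8"))


def paste_over(current: str, start: int, end: int, pasted: str) -> str:
    """Replace current[start:end] with the pasted text, as a paste over a selection would."""
    return current[:start] + pasted + current[end:]


before = "\u6c49" * 80  # 80 hanzi, 3 bytes each in UTF-8
after = paste_over(before, 0, len(before), "a" * 200)

# Hypothetical worst-case character cap if every character needed 3 bytes.
char_cap = LIMIT_BYTES // 3

print(utf8_len(before), utf8_len(after), char_cap)
```

The pasted result is smaller in bytes (200 versus 240) and fits the limit, yet it is far longer in characters, so any scheme that caps characters based on the old content would wrongly reject it.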
> I'd been thinking about adding something like it to MediaWiki itself, but kind of forgot about it. While doing so, we should also modify the edit UI so that, instead of silently truncating overlong summaries, it returns the user to the edit form with a warning message and the truncated summary, allowing the user to edit it before saving. That would let us safely raise the default maxlength to 255 even for users without JavaScript.
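The server-side half of that proposal might look roughly like the following Python sketch. It is not MediaWiki code: `truncate_utf8` and `check_summary` are hypothetical names, and the 255-byte limit is the figure from the paragraph above. The truncation cuts on a character boundary so a multi-byte character is never split.

```python
from typing import Tuple

MAX_SUMMARY_BYTES = 255  # limit proposed above


def truncate_utf8(text: str, max_bytes: int) -> Tuple[str, bool]:
    """Truncate text to at most max_bytes of UTF-8 without splitting a character.

    Returns the (possibly shortened) text and a flag saying whether it was cut.
    """
    raw = text.encode("utf-8")
    if len(raw) <= max_bytes:
        return text, False
    # Slicing valid UTF-8 can leave one incomplete sequence at the end;
    # errors="ignore" drops exactly that partial character.
    return raw[:max_bytes].decode("utf-8", errors="ignore"), True


def check_summary(summary: str) -> Tuple[str, str]:
    """Decide whether to save, or to redisplay the form with a warning."""
    truncated, was_cut = truncate_utf8(summary, MAX_SUMMARY_BYTES)
    return ("redisplay_with_warning" if was_cut else "save"), truncated


print(check_summary("fixed typo"))
print(check_summary("\u6c49" * 100)[0])
```

An overlong summary comes back already shortened, so the user can edit it and resubmit instead of having it silently cut on save.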
Could be done, though that complicates the form logic of course.
I suspect that there's not a huge reason to be using VARCHARs/TINYBLOBs for summary fields to begin with. They're not used for sorting or anything, and those that are TINYBLOBs could just be made regular ol' BLOBs without any performance degradation or extra storage usage, AFAIK, unless the actual values are bigger than 255 bytes.
The main area where we _need_ to enforce these funny length limits is on page titles and usernames, say for renaming pages or creating accounts. Right now I think we just spit back an "invalid title" error or some such if you do submit an overlong title.
Of course, if MySQL properly supported Unicode to begin with we'd probably just be using fixed-length Unicode fields for the titles and usernames and such where it does make sense to have a clean limit, and the 'maxlength' attribute would work fine. Ah well. :)
> Of course, the same feature could and should be deployed for other length-limited summary fields, such as move summaries. (And BTW, why the heck is the UI element for those a textarea, anyway?)
Because the original single-line input field was frequently confused with the target field by impatient typists, leading to unexpected results.
-- brion