On 1/7/09 2:44 PM, Ilmari Karonen wrote:
Brion Vibber wrote:
The 'maxlength' on the input field is enforced in Unicode *characters*
by the browser.
I developed a JavaScript workaround some time ago to limit edit
summaries to exactly 250 bytes. It's pretty hacky, but works:
http://en.wikipedia.org/wiki/User:Ilmari_Karonen/longeditsummary.js
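For illustration only, here is a minimal sketch of the byte-based limit in Python. It is not the linked script; the function name truncate_utf8 is hypothetical, and it assumes the summary is stored as UTF-8. It keeps the longest prefix that fits in the byte budget without splitting a multi-byte character.

```python
def truncate_utf8(text: str, max_bytes: int = 250) -> str:
    """Return the longest prefix of text whose UTF-8 encoding fits in max_bytes.

    Hypothetical helper for illustration; 250 is the figure quoted above.
    """
    encoded = text.encode("utf-8")
    if len(encoded) <= max_bytes:
        return text
    # Cut at the byte limit, then let the decoder drop the partial
    # trailing character (if the cut landed inside one).
    return encoded[:max_bytes].decode("utf-8", errors="ignore")


# ASCII: one byte per character, so 250 characters survive.
print(len(truncate_utf8("a" * 300)))   # 250
# A BMP hanzi takes three bytes, so only 83 characters (249 bytes) fit.
print(len(truncate_utf8("汉" * 100)))  # 83
```

The point of the sketch is that the same byte budget allows very different character counts depending on the script, which a character-based 'maxlength' cannot express.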
Hmm... offhand, I'm not sure how reliably this would interact with
things like cut-and-paste. If in one paste operation we try to replace
80 hanzi characters with 200 ASCII characters, what happens?
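To make the numbers in that paste scenario concrete, here is a small Python sketch. It assumes UTF-8 storage and hanzi from the Basic Multilingual Plane (three bytes each); the helper utf8_len is hypothetical. It only does the arithmetic and does not model how a browser or script handles the paste event.

```python
def utf8_len(text: str) -> int:
    """Length of text in bytes when encoded as UTF-8."""
    return len(text.encode("utf-8"))


hanzi = "汉" * 80        # 80 characters, 3 bytes each in UTF-8
ascii_text = "a" * 200   # 200 characters, 1 byte each

print(len(hanzi), utf8_len(hanzi))            # 80 240
print(len(ascii_text), utf8_len(ascii_text))  # 200 200
```

So the replacement raises the character count from 80 to 200 while lowering the byte count from 240 to 200. A limit counted in characters and a limit counted in bytes move in opposite directions here, which is why a script that adjusts a character limit on the fly could plausibly reject a paste that would in fact fit.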
I'd been thinking about adding something like it to MediaWiki itself,
but kind of forgot about it. While doing so, we should also modify the
edit UI so that, instead of silently truncating overlong summaries, it
returns the user to the edit form with a warning message and the
truncated summary, allowing the user to edit it before saving. That
would let us safely raise the default maxlength to 255 even for users
without JavaScript.
Could be done, though that complicates the form logic of course.
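A rough sketch of what that extra form step might look like, written in Python purely for illustration (MediaWiki itself is PHP, and none of these names exist there). SummaryCheck, check_summary and the 255-byte limit are assumptions based on the figure mentioned above.

```python
from dataclasses import dataclass
from typing import Optional

MAX_SUMMARY_BYTES = 255  # assumed limit, taken from the figure in the thread


@dataclass
class SummaryCheck:
    ok: bool                       # True if the summary can be saved as-is
    summary: str                   # original or truncated summary
    warning: Optional[str] = None  # message to show on the re-displayed form


def check_summary(summary: str, max_bytes: int = MAX_SUMMARY_BYTES) -> SummaryCheck:
    """Hypothetical server-side check: never truncate silently."""
    encoded = summary.encode("utf-8")
    if len(encoded) <= max_bytes:
        return SummaryCheck(ok=True, summary=summary)
    # Truncate on a character boundary and hand the result back to the user.
    truncated = encoded[:max_bytes].decode("utf-8", errors="ignore")
    warning = (
        f"Summary exceeded {max_bytes} bytes and was shortened; "
        "please review it before saving."
    )
    return SummaryCheck(ok=False, summary=truncated, warning=warning)


result = check_summary("汉" * 100)
if not result.ok:
    # A real form handler would re-render the edit form here instead of saving.
    print(result.warning)
    print(len(result.summary), "characters kept")
```

The complication is the branch at the end: the save path now has a second outcome besides "saved" and "error", namely "show the form again with modified input".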
I suspect that there's not a huge reason to be using VARCHARs/TINYBLOBs
for summary fields to begin with. They're not used for sorting or
anything, and those that are TINYBLOBs could just be made regular ol'
BLOBs without any performance degradation or extra storage usage AFAIK
unless the actual values are bigger than 255 bytes.
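For reference, MySQL documents the storage requirement as L + 1 bytes for TINYBLOB (L < 2^8) and L + 2 bytes for BLOB (L < 2^16), so the difference is one byte of length prefix per value. The Python sketch below just restates that arithmetic; stored_size is a hypothetical helper, and actual on-disk usage also depends on the storage engine and row format, which it does not model.

```python
# Length-prefix size and maximum value length per the documented
# data type storage requirements.
LENGTH_PREFIX_BYTES = {"TINYBLOB": 1, "BLOB": 2}
MAX_VALUE_BYTES = {"TINYBLOB": 2**8 - 1, "BLOB": 2**16 - 1}


def stored_size(column_type: str, value: bytes) -> int:
    """Documented storage requirement in bytes for one value (hypothetical helper)."""
    if len(value) > MAX_VALUE_BYTES[column_type]:
        raise ValueError(f"{len(value)} bytes does not fit in {column_type}")
    return len(value) + LENGTH_PREFIX_BYTES[column_type]


summary = b"x" * 100
print(stored_size("TINYBLOB", summary))  # 101
print(stored_size("BLOB", summary))      # 102
```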
The main area where we _need_ to enforce these funny length limits is on
page titles and usernames, say for renaming pages or creating accounts.
Right now I think we just spit back an "invalid title" error or some
such if you do submit an overlong title.
Of course, if MySQL properly supported Unicode to begin with we'd
probably just be using fixed-length Unicode fields for the titles and
usernames and such where it does make sense to have a clean limit, and
the 'maxlength' attribute would work fine. Ah well. :)
Of course, the same feature could and should be deployed for other
length-limited summary fields, such as move summaries. (And BTW, why
the heck is the UI element for those a textarea, anyway?)
Because the original single-line input field was frequently confused
with the target field by impatient typers, leading to unexpected results.
-- brion