On Thu, May 8, 2008 at 11:34 AM, DanTMan <dan_the_man(a)telus.net> wrote:
`table` is already handled before the test for
tablename stuff.
We're just trying to see if anything that is part of a SQL query shows
up inside. So completeness isn't what's being tested.
Doing it properly and parsing it is the only thing that will give
correct results in pathological cases.
You noted the use of () next to ON, rather than just
whitespace. Any
similar types of characters?
Sure. You could have comments, you could probably have all sorts of
weird stuff. Granted that comments would be pathological, but no more
than a table named "-on-". There may be additional non-pathological
cases.
Or maybe we're overcomplicating this...
/((\S{,64}|`[^`]{,64}`)\.)?(\S{,64}|`[^`]{,64}`)/iS
All we really want is to know if the input is valid. And all we're
trying to catch is:
- table
- database.table
And the variants of them with some amount of backticks.
Better version:
/^\s*([a-z0-9_]+\.|`[^`]+`\.)?([a-z0-9_]+|`[^`]+`)\s*$/i
There's no reason to complicate the regex with explicit maximum
lengths. If additional characters are allowed in unquoted table
names, add those to the [a-z0-9_] class. This should avoid all valid
joins and similar. (It still fails on comments, though! :P )
Yes " is usable if ANSI_QUOTES is enabled, but
that is in no way widely
compatible with MySQL itself. So none of the code should ever output that.
Well, technically the code isn't outputting it, it's just passing it
to the abstraction layer. I figured we should be generous. Notice
that I also put in the MSSQLism [table name]. Not strictly any point,
you're right, but it gives me a warm and fuzzy feeling to permit
standards compliance in what we pass to abstraction layers, at least,
even if MySQL sucks too much to allow it to actually be run. :)