Simetrical wrote:
On Thu, May 8, 2008 at 11:34 AM, DanTMan dan_the_man@telus.net wrote:
`table` is already handled before the test for tablename stuff. We're just trying to see if anything that is part of a SQL query shows up inside. So completeness isn't what's being tested.
Doing it properly and parsing it is the only thing that will give correct results in pathological cases.
You noted the use of () next to ON, rather than just whitespace. Any similar types of characters?
Sure. You could have comments, you could probably have all sorts of weird stuff. Granted that comments would be pathological, but no more than a table named "-on-". There may be additional non-pathological cases.
Or maybe we're overcomplicating this...
/((\S{,64}|`[^`]{,64}`).)?(\S{,64}|`[^`]{,64}`)/iS All we really want is to know if the input is valid. And all we're trying to catch is:
- table
- database.table
And the variants of them with some amount of backticks.
Better version:
/^\s*([a-z0-9_]+.|`[^`]+`.)?([a-z0-9_]+|`[^`]+`)\s*$/i
There's no reason to complicate the regex with explicit maximum lengths. If additional characters are allowed in unquoted table names, add those to the [a-z0-9_] class. This should avoid all valid joins and similar. (It still fails on comments, though! :P )
"/^\s*(\S+|`[^`]+`)(.\S+|.`[^`]+`)?\s*$/i" All we need to test for is non-whitespace. Anything which is completely unbroken by whitespace can't be a query at all. "SELECT*FROMfoo" ;) And the backquote testing makes sure that `...` still passes and you can still pass ` foo ` in and have it work. The second group is there so that `foo`.bar, and foo.`bar` don't break. And I handily moved the condition to the second group to optimize the regex. The \S+ alone will completely match 'foo.bar' without any backtracking or anything from the processor. Though, considering your \s*'s I'll probably have to run the table and database through trim.
Yes " is usable if ANSI_QUOTES is enabled, but that is in no way widely compatible with MySQL itself. So none of the code should ever output that.
Well, technically the code isn't outputting it, it's just passing it to the abstraction layer. I figured we should be generous. Notice that I also put in the MSSQLism [table name]. Not strictly any point, you're right, but it gives me a warm and fuzzy feeling to permit standards compliance in what we pass to abstraction layers, at least, even if MySQL sucks too much to allow it to actually be run. :)
Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l
~Daniel Friesen(Dantman) of: -The Gaiapedia (http://gaia.wikia.com) -Wikia ACG on Wikia.com (http://wikia.com/wiki/Wikia_ACG) -and Wiki-Tools.com (http://wiki-tools.com)