Hi all!
I recently found that it is less than clear how numbers should be quoted/escaped in SQL queries. Should DatabaseBase::addQuotes() be used, or rather just inval(), to make sure it's really a number? What's the best practice?
Looking at DatabaseBase::makeList(), it seems that addQuotes() is used on all values, string or not. So, what does addQuotes() actually do? Does it always add quotes, turning the value into a string literal, or does it rather apply whatever quoting/escaping is appropriate for the given data type?
addQuotes' documentation sais:
* If it's a string, adds quotes and backslashes * Otherwise returns as-is
That's a plain LIE. Here's the code:
if ( $s === null ) { return 'NULL'; } else { # This will also quote numeric values. This should be harmless, # and protects against weird problems that occur when they really # _are_ strings such as article titles and string->number->string # conversion is not 1:1. return "'" . $this->strencode( $s ) . "'"; }
So, it actually always returns a quoted string literal, unless $s is null.
But is it true what the comment sais? Is it really always harmless to quote numeric values? Will all database engines always magically convert them to numbers before executing the query? If not, this may be causing table scans. That would be bad - but I suppose someone would have noticed by now...
So... at the very least, addQuotes' documentation needs fixing. And perhaps it would be nice to have an additional method that only applies the appropriate quoting, e.g. escapeValue or some such - that's how addQuotes seems to be currently used, but that's not what it actually does... What do you think?
-- daniel
PS: There's more fun. The DatabaseMssql class overrides addQuotes to support Blob object. For the case $s is a Blob object, this code is used:
return "'" . $s->fetch( $s ) . "'";
The value is used raw, without any escaping. Looks like if there's a ' in the blob, fun things will happen. Or am I missing something?
I can't speak for all databases, but I know MySQL fairly well, and it will generally be safe there.
You will be able to find some edge cases where it may not give you what you expected. For example select 0x10 + 1; is not the same as select '0x10' + 1;
The first will give 17, the second 1 as the string will get converted to 0, but I don't know that there is a lot of non-decimal math being done here.
In most cases though, type conversion will work as you expect it too, the following will all give 2 select '1' + '1' select '1' + 1 select 1+ 1
Also, if we use MySQL's query cache, it is (or at least was) fairly naive and compared query strings for an exact match so you will get a cache miss if you run the same query twice but only include redundant quotes one time.
Quoting nulls would be a problem, but it is not doing that.
More importantly, the function strencode() called inside that addQuotes() function is overridden for different underlying databases. The MySQL one looks good. It is using mysql_real_escape_string and providing the connection so the correct character set will be used.
It looks to me like the MS SQL version requires the programmer to manually call encodeBlob() before building their query if they know they are inserting binary data.
Luke Welling
On Tue, Dec 4, 2012 at 4:52 AM, Daniel Kinzler daniel@brightbyte.de wrote:
Hi all!
I recently found that it is less than clear how numbers should be quoted/escaped in SQL queries. Should DatabaseBase::addQuotes() be used, or rather just inval(), to make sure it's really a number? What's the best practice?
Looking at DatabaseBase::makeList(), it seems that addQuotes() is used on all values, string or not. So, what does addQuotes() actually do? Does it always add quotes, turning the value into a string literal, or does it rather apply whatever quoting/escaping is appropriate for the given data type?
addQuotes' documentation sais:
* If it's a string, adds quotes and backslashes * Otherwise returns as-is
That's a plain LIE. Here's the code:
if ( $s === null ) { return 'NULL'; } else { # This will also quote numeric values. This should be harmless, # and protects against weird problems that occur when they really # _are_ strings such as article titles and string->number->string # conversion is not 1:1. return "'" . $this->strencode( $s ) . "'"; }
So, it actually always returns a quoted string literal, unless $s is null.
But is it true what the comment sais? Is it really always harmless to quote numeric values? Will all database engines always magically convert them to numbers before executing the query? If not, this may be causing table scans. That would be bad - but I suppose someone would have noticed by now...
So... at the very least, addQuotes' documentation needs fixing. And perhaps it would be nice to have an additional method that only applies the appropriate quoting, e.g. escapeValue or some such - that's how addQuotes seems to be currently used, but that's not what it actually does... What do you think?
-- daniel
PS: There's more fun. The DatabaseMssql class overrides addQuotes to support Blob object. For the case $s is a Blob object, this code is used:
return "'" . $s->fetch( $s ) . "'";
The value is used raw, without any escaping. Looks like if there's a ' in the blob, fun things will happen. Or am I missing something?
Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l
MySQL very happily handles quoted numbers (with the exception of LIMIT values), but iirc the SQL standard says numeric types shouldn't be quoted. I think db2 will return an error if you quote a number, for example. So to keep the db-specific code to a minimum, we may want to update that.
On Tue, Dec 4, 2012 at 1:52 AM, Daniel Kinzler daniel@brightbyte.de wrote:
Hi all!
I recently found that it is less than clear how numbers should be quoted/escaped in SQL queries. Should DatabaseBase::addQuotes() be used, or rather just inval(), to make sure it's really a number? What's the best practice?
Looking at DatabaseBase::makeList(), it seems that addQuotes() is used on all values, string or not. So, what does addQuotes() actually do? Does it always add quotes, turning the value into a string literal, or does it rather apply whatever quoting/escaping is appropriate for the given data type?
addQuotes' documentation sais:
* If it's a string, adds quotes and backslashes * Otherwise returns as-is
That's a plain LIE. Here's the code:
if ( $s === null ) { return 'NULL'; } else { # This will also quote numeric values. This should be harmless, # and protects against weird problems that occur when they really # _are_ strings such as article titles and string->number->string # conversion is not 1:1. return "'" . $this->strencode( $s ) . "'"; }
So, it actually always returns a quoted string literal, unless $s is null.
But is it true what the comment sais? Is it really always harmless to quote numeric values? Will all database engines always magically convert them to numbers before executing the query? If not, this may be causing table scans. That would be bad - but I suppose someone would have noticed by now...
So... at the very least, addQuotes' documentation needs fixing. And perhaps it would be nice to have an additional method that only applies the appropriate quoting, e.g. escapeValue or some such - that's how addQuotes seems to be currently used, but that's not what it actually does... What do you think?
-- daniel
PS: There's more fun. The DatabaseMssql class overrides addQuotes to support Blob object. For the case $s is a Blob object, this code is used:
return "'" . $s->fetch( $s ) . "'";
The value is used raw, without any escaping. Looks like if there's a ' in the blob, fun things will happen. Or am I missing something?
Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l
Such database should override it. Although it shouild probably return the numbers, yes. The problems of "strings such as article titles and string->number->string" should not appear, since you would provide them as "1234", not 1234.
wikitech-l@lists.wikimedia.org