I did think of \b, however I'm wary of it. I tied the test to whitespace or starting/ending of string because of the remote case where someone has something like -on- as their table name. Because \b matches breaks there are a fair number of symbols which it also considers to be word breaks because they are not word characters. And it's possible for some of those to be valid inside of table names.
^_^ Actually your python example is kinda interesting. I was actually contemplating something not deep like what you were mentioning, but something which does do some sort of overloading. It came from the usage of tableName to generate table names with aliases. Imagine if tableName, or something else returned a object rather than a flat string. And that object could contain a table name, and an alias for that. It would even work beautifully in current uses because it would override the toString method and when used in string concatenation it would output a valid SQL string, with the Alias to if necessary. And could have extra methods for any special type of syntax for table alias pairs. We could even tweak tableName to accept an optional alias. And tweak the inputs of the other functions to allow some special table alias pair.
But yes... It is a bit of a tangent. However, it is worth looking at as a side project. Honestly, I see piles of shitty database abstraction systems which try and rewrite your SQL to make them fully 'cross database compatible' when most people consider that a load of bullshit because no matter how much you abstract you still need to understand a system you migrate to and it doesn't just magically work. But what I don't see a single instance of, is what MediaWiki has done a bit of. Converting the construction of SQL in strings, into a set of helper functions which aid in readability, portability, and cross-compliance of most sql calls. Which even have the amazing bonus of properly escaping things where they come in to eliminate most of the issues of accidentally forgetting to secure something which could be used for SQL injection. Honestly, would you rather call $db->select, or have to remember to mysql_real_escape_string every bit of data manually? The former always escapes cleanly and avoids having you forget to do it.
We may also want to add some helpers for a few things people might want to do in conditions. Basically IN, AND, OR, >, <, not, and a few other things I've found myself needing to construct a mini condition string rather than using the nice assignment that escapes everything.
~Daniel Friesen(Dantman) of: -The Gaiapedia (http://gaia.wikia.com) -Wikia ACG on Wikia.com (http://wikia.com/wiki/Wikia_ACG) -and Wiki-Tools.com (http://wiki-tools.com)
Simetrical wrote:
On Wed, May 7, 2008 at 8:44 PM, DanTMan dan_the_man@telus.net wrote:
Well... I was worried about people with tables like "onsomething" or whatever... But I suppose I can test strictly for things with whitespace around them "/(^|\s)(ON|JOIN)(\s|$)/i" that should solve the issue.
Or use \b, that's what it's there for.
But I still think 3 is good. IMHO we should probably do 5 and 3 for best practice and avoidance of future issues.
Maybe, given how the API has its own database handling layer anyway, seemingly.
Actually, I used to do something to the Database class. I actually had a tweak for select and a few other functions. I basically added a extra parameter $doQuery to the end which defaulted to true. Basically it meant that you could explicitly send false and have select return a SQL statement for you instead. That way you could go as far as to use select, then manually UNION them together. But what would be absolutely beautiful would be: list( $user, $ug, $ugS) = $db->tableNameN('user', 'user_groups', $ug = $db->tableName("`$wgSharedDB`.`{$wgSharedPrefix}user_groups`"); $res = $db->union( $db-join( $db->select( $user, conds, etc..., /*doQuery=*/false ), $ug, conds, etc..., /*doQuery=*/false ), $db-join( $db->select( $user, conds, etc..., /*doQuery=*/false ), $ugS, conds, etc..., /*doQuery=*/false ), ); ^_^ Which basically, would have select return sql which join uses to construct a join statement for. And then union, well... Unions them together.
Operator overloading would be the awesomest for this. Consider this Python statement:
res = SELECT + (field.field1, field.field2) + FROM( table.user, table.page ) + WHERE( field.page_title == field.user_name and field.page_namespace == NS_USER and field.user_id == 1729 ) + UNION + ...
with all operators (+, ==, and) overloaded, and SELECT/FROM/WHERE/UNION object names, and "field" and "table" dummy objects that just exist for __get__ methods (so table.user would be the same as $dbr->tableName( 'user' )).
Of course, I don't seriously suggest such a thing, so I guess it's off-topic, but I've pondered it before and thought it would be fun to mention as a counterpoint. The fact that it's possible is somewhat mind-boggling. Maybe an argument against operator overloading? :)
Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l