Transaction nesting and savepoints

List overview All Threads
Download

newer

older

Community vs. centralized...

Bugzilla Weekly Report

Platonides

31 Aug 2010 31 Aug '10

8:08 a.m.

There are some places in our code where we are nesting transactions.

A code like: $dbw->begin(); $dbw->insert( 'image', ... ); $article->doEdit( $pageText, $comment, EDIT_NEW | EDIT_SUPPRESS_RC ); $dbw->query( "UPDATE $site_stats SET ss_images=ss_images+1" ); $dbw->commit();

Is apparently safely grouped into a transaction, but the UPDATE won't be in the same transaction as the insert, since Article::doEdit will itself open a transaction, breaking the old one. This is an unexpected result, and hard to realise, should the second transaction lie several calls down (or even inside a hook!).

The transaction break, common to all our supported databases is not even consistent among them: * Mysql will commit the open transaction if you begin another one [1]. * Postgres will ignore the second begin (and issue a warning) if you are inside a transaction [2]. * In sqlite an attempt to invoke the BEGIN command within a transaction will fail with an error [3] (our driver forces a commit there, matching mysql behavior). * Oracle is not using a specific statement in our driver begin(). Calling begin() inside another transaction will be silently ignored (matches postgres). * Mssql only takes into account the outermost transaction [4] (matches postgres). * Ibm_db2 driver is changing the autocommit property like Oracle, thus matching postgres.

I propose changing Database::begin() / commit() / rollback() to keep the count in mTrxLevel and perform a savepoint instead of a BEGIN should it be called inside another one.

Savepoints are a way to perform partial rollbacks inside a bigger transaction. They are supported by all our databases [5] [6] [7] [8] [9] [10]. We shouldn't even have problems with our 4.0.30 servers, since its support was added in mysql 4.0.14 [11].

1-http://dev.mysql.com/doc/refman/5.1/en/implicit-commit.html 2-http://www.postgresql.org/docs/9.0/static/sql-begin.html 3-http://sqlite.org/lang_transaction.html 4-http://msdn.microsoft.com/en-us/library/ms188929.aspx

5-http://dev.mysql.com/doc/refman/5.1/en/savepoint.html 6-http://www.postgresql.org/docs/9.0/static/sql-savepoint.html 7-http://sqlite.org/lang_savepoint.html 8-http://www.dba-oracle.com/t_savepoint.htm 9-http://msdn.microsoft.com/en-us/library/ms378414.aspx 10-http://publib.boulder.ibm.com/infocenter/db2luw/v9/topic/com.ibm.db2.udb.adm... 11-http://dev.mysql.com/doc/refman/4.1/en/savepoint.html

Show replies by date

31 Aug 31 Aug

8:59 a.m.

On Tue, Aug 31, 2010 at 10:08 AM, Platonides Platonides@gmail.com wrote:

...

I propose changing Database::begin() / commit() / rollback() to keep the count in mTrxLevel and perform a savepoint instead of a BEGIN should it be called inside another one.

FWIW, I think I hacked one of those locally because I got tired of the PG log being spammed with 'not in a transaction' warnings since, likely due to this issue, MW calls $dbw->commit() regardless of the txn level.

The PG driver already (ab)uses savepoints to emulate MySQL since MW uses INSERT IGNORE in quite a few places.

Aryeh Gregor

10:09 a.m.

On Tue, Aug 31, 2010 at 11:08 AM, Platonides Platonides@gmail.com wrote:

...

I propose changing Database::begin() / commit() / rollback() to keep the count in mTrxLevel and perform a savepoint instead of a BEGIN should it be called inside another one.

Savepoints are a way to perform partial rollbacks inside a bigger transaction. They are supported by all our databases [5] [6] [7] [8] [9] [10]. We shouldn't even have problems with our 4.0.30 servers, since its support was added in mysql 4.0.14 [11].

I think there would be performance issues with this. Specifically, there are a number of places where we do a commit because we need to release locks immediately to avoid contention. It looks like the locks are held until the transaction commit when savepoints are used, so this might make us hold a lot of locks for much too long.

DatabaseBase should definitely be DBMS-neutral here, though, and should not depend on MySQL's behavior. I suggest that begin() check the transaction level, and if there's an open transaction, it should do an explicit commit() first and wfWarn(). This kind of thing is a bug -- it will lead to less atomicity than it looks like we have from casual code inspection, and these cases should be looked at and fixed if possible.

Maciej Jaros

11:42 p.m.

At 2010-08-31 19:09, Aryeh Gregor wrote:

...

On Tue, Aug 31, 2010 at 11:08 AM, PlatonidesPlatonides@gmail.com wrote:

...
I propose changing Database::begin() / commit() / rollback() to keep the count in mTrxLevel and perform a savepoint instead of a BEGIN should it be called inside another one.

Savepoints are a way to perform partial rollbacks inside a bigger transaction. They are supported by all our databases [5] [6] [7] [8] [9] [10]. We shouldn't even have problems with our 4.0.30 servers, since its support was added in mysql 4.0.14 [11].

I think there would be performance issues with this. Specifically, there are a number of places where we do a commit because we need to release locks immediately to avoid contention. It looks like the locks are held until the transaction commit when savepoints are used, so this might make us hold a lot of locks for much too long. DatabaseBase should definitely be DBMS-neutral here, though, and should not depend on MySQL's behavior. I suggest that begin() check the transaction level, and if there's an open transaction, it should do an explicit commit() first and wfWarn(). This kind of thing is a bug -- it will lead to less atomicity than it looks like we have from casual code inspection, and these cases should be looked at and fixed if possible.

I think it still should be conscious decision and so those functions could use their first... hm... second parameter as the transaction name. For MS SQL you can simply use BEGIN TRANSACTION [name] (I think it would be more natural), for MySQL I guess savepoints should be used.

Regards, Nux.

Aryeh Gregor

1 Sep 1 Sep

8:03 a.m.

On Wed, Sep 1, 2010 at 2:42 AM, Maciej Jaros egil@wp.pl wrote:

...

I think it still should be conscious decision and so those functions could use their first... hm... second parameter as the transaction name. For MS SQL you can simply use BEGIN TRANSACTION [name] (I think it would be more natural), for MySQL I guess savepoints should be used.

I'm not sure what you're saying here. Are you suggesting some solution where commit() does not actually always commit the current transaction? If so, as I said, this is a problem because it will cause locks to be held for much too long in some cases.

Maciej Jaros

11:13 a.m.

At 2010-09-01 17:03, Aryeh Gregor wrote:

...

On Wed, Sep 1, 2010 at 2:42 AM, Maciej Jarosegil@wp.pl wrote:

...
I think it still should be conscious decision and so those functions could use their first... hm... second parameter as the transaction name. For MS SQL you can simply use BEGIN TRANSACTION [name] (I think it would be more natural), for MySQL I guess savepoints should be used.

I'm not sure what you're saying here. Are you suggesting some solution where commit() does not actually always commit the current transaction? If so, as I said, this is a problem because it will cause locks to be held for much too long in some cases.

Not exactly. If you know you can and should commit transaction called for example "article_update" then it's OK. If you want to commit "image_insert_and_update" then it's OK too. If you are making a commit for everything that is started then it doesn't seem OK as pointed out by Platonides in his example.

Aryeh Gregor

1:53 p.m.

On Wed, Sep 1, 2010 at 2:13 PM, Maciej Jaros egil@wp.pl wrote:

...

Not exactly. If you know you can and should commit transaction called for example "article_update" then it's OK. If you want to commit "image_insert_and_update" then it's OK too. If you are making a commit for everything that is started then it doesn't seem OK as pointed out by Platonides in his example.

When it comes to locks, you can't commit only some things and not others, at least not on MySQL. Locks are only released if you commit everything, so to release locks, you do need to commit everything. I don't think we can avoid this, but we can raise warnings so that we know where the places are, and see if we can figure out some way to fix them -- like by moving the sensitive queries to the very end of the transaction and removing the nested transaction.

I don't think naming is necessary. As long as begin() is always paired correctly with commit()/rollback(), we should be fine. Adding a name parameter would make sense as an error-catching measure, though.

Platonides

3:47 p.m.

Aryeh Gregor wrote:

...

On Wed, Sep 1, 2010 at 2:42 AM, Maciej Jaros egil@wp.pl wrote:

...
I think it still should be conscious decision and so those functions could use their first... hm... second parameter as the transaction name. For MS SQL you can simply use BEGIN TRANSACTION [name] (I think it would be more natural), for MySQL I guess savepoints should be used.

I'm not sure what you're saying here. Are you suggesting some solution where commit() does not actually always commit the current transaction? If so, as I said, this is a problem because it will cause locks to be held for much too long in some cases.

Is there some way to find out those points were the commit needs to be there for lock freeing and not just for normal transaction finish ?

Aryeh Gregor

4:04 p.m.

On Wed, Sep 1, 2010 at 6:47 PM, Platonides Platonides@gmail.com wrote:

...

Is there some way to find out those points were the commit needs to be there for lock freeing and not just for normal transaction finish ?

Look at every single commit() and guess? Alternatively, we could just assume none of them are for releasing locks, skip them and see what Domas comments out to fix the site when it breaks. ;)

Maciej Jaros

5 Sep 5 Sep

4:21 p.m.

At 2010-09-02 01:04, Aryeh Gregor wrote:

...

On Wed, Sep 1, 2010 at 6:47 PM, PlatonidesPlatonides@gmail.com wrote:

...
Is there some way to find out those points were the commit needs to be there for lock freeing and not just for normal transaction finish ?

Look at every single commit() and guess? Alternatively, we could just assume none of them are for releasing locks, skip them and see what Domas comments out to fix the site when it breaks. ;)

Maybe this could be done step by step by adding named transactions and some other other method/parameter like commitToUnlockTables so the developers using it would know what it is used for ;-). Commit (in general) is obviously not always done to simply unlock tables. Some might not expect changes be committed already and might want to rollback but they would be unable to.

So until proper handling of transactions would be implemented warnings might be thrown if begin/commit would not be called without proper attributes. An additional warning might be thrown if more then say... 3 nested transactions would be started as it would probably mean locking too many tables.

This approach would catch this: begin transaction outer_tran update tbl1 set name='1234' --locks tbl1 begin transaction inner_tran update tbl2 set name='1234' -- locks tbl2 begin transaction inner_inner_tran -- -> throw warning: too many open tran update tbl3 set name='1234' -- locks tbl3 commit transaction inner_inner_tran commit transaction inner_tran commit transaction outer_tran

But then again it wouldn't catch this: begin transaction outer_tran update tbl1 set name='1234' --locks tbl1 begin transaction inner_tran update tbl2 set name='1234' -- locks tbl2 commit transaction inner_tran begin transaction other_inner_tran update tbl3 set name='1234' -- locks tbl3, but no warning is thrown (and yes, MSSQL keeps tbl2 locked too) commit transaction other_inner_tran commit transaction outer_tran

And so (with the exception of installation and upgrade process) warnings might be thrown when you actually run to many inserts/updates after most outer transaction was started. Something like this would work I think: if ($this->trxLevel()>0 && $this->isWriteQuery( $sql )) { $this->mNumWritesSinceTran++;// set to 0 on tran end if ($this->mNumWritesSinceTran>TOO_MANY_WRITES_SINCE_TRAN) { wfWarn( "Too many writes since most outer transaction started ({$this->mNumWritesSinceTran}). Avoid locking tables when possible." ); } }

Regards, Nux.

Aryeh Gregor

6 Sep 6 Sep

11:03 a.m.

On Sun, Sep 5, 2010 at 7:21 PM, Maciej Jaros egil@wp.pl wrote:

...

Maybe this could be done step by step by adding named transactions and some other other method/parameter like commitToUnlockTables so the developers using it would know what it is used for ;-). Commit (in general) is obviously not always done to simply unlock tables. Some might not expect changes be committed already and might want to rollback but they would be unable to.

The thing is, we know the status quo works. If we change behavior so that locks (these are row locks for InnoDB, BTW, not table locks) aren't released quickly enough, the worst case is it might take down the site, or at least significantly increase deadlock errors. So it's best to be a bit cautious here.

Maybe I'm being overly pessimistic, though. Perhaps Domas or Tim could share an opinion.

...

So until proper handling of transactions would be implemented warnings might be thrown if begin/commit would not be called without proper attributes. An additional warning might be thrown if more then say... 3 nested transactions would be started as it would probably mean locking too many tables.

No, that's not how it works. The problem is even if you run a single query that locks a bunch of rows that are likely to be accessed. For instance, if you run a query that updates a row of the page table, then any other changes to those rows will have to wait until the transaction is committed, and if it waits too long, InnoDB might decide there's a deadlock and start rolling back transactions with an error (which will typically get reported to the user).

...

if ($this->trxLevel()>0 && $this->isWriteQuery( $sql )) { $this->mNumWritesSinceTran++;// set to 0 on tran end if ($this->mNumWritesSinceTran>TOO_MANY_WRITES_SINCE_TRAN) {

The number of queries doesn't matter. One query that updates a lot of rows in a key table could be a problem, dozens that update rows in rarely-accessed tables might not be a problem. They need to be identified manually and/or by actually trying them. The latter is bad because it risks taking the site down.

Put it this way: what's being proposed here will have the overall effect of making transactions longer. Longer transactions means more locks, which means more lock contention, which means worse performance. Atomicity needs to be weighed against performance here -- just removing all the extra transaction commits is not wise. I'm thinking a good approach would be:

1) Give a $fname parameter to begin(), commit(), and rollback(), and update everything to pass __METHOD__ (or __METHOD__ . '-something' if there are multiples in the function). Keep the opener of the current transaction in some member variable.

2) If begin() is called when a transaction is already in process, run commit() first (so we match MySQL's current behavior on all DBMSes, for consistency) and do wfWarn(), providing a list of the caller of begin(), and the caller of begin() for the in-progress transaction. If wfWarn() gets triggered too much, switch it to wfDebug().

Then we can see what's causing premature transaction commits and work on fixing them one by one. In many cases, it might be straightforward to refactor them to remove the nesting without holding transactions open for longer than they should be -- it would have to be decided on a case-by-case basis. If it turns out there are lots of places where we really want to use checkpoints or something that gives us nested-transaction-like behavior, we can talk about that then.

5235

Age (days ago)

5241

Last active (days ago)

wikitech-l@lists.wikimedia.org

10 comments

4 participants

tags (0)

participants (4)

Aryeh Gregor
Maciej Jaros
OQ
Platonides