Hi Tim
Thanks for bringing some light into the DBO_TRX stuff. Seems like few knew it existed, and hardly anyone understood what it means or how it should be used.
I'll give my thoughts inline and propose a solution at the bottom.
On 25.09.2012 13:38, Tim Starling wrote:
On 25/09/12 19:33, Daniel Kinzler wrote:
So, can someone shed light on what DBO_TRX is intended to do, and how it is supposed to work?
Maybe you should have asked that before you broke it with I8c0426e1.
Well, I did ask about nested transactions on the list. Nobody mentioned the scheme you describe. Is it documented somewhere?
Anyway, I just added warnings, the behavior didn't change.
DBO_TRX provides the following benefits:
- It provides improved consistency of write operations for code which
is not transaction-aware, for example rollback-on-error.
But it *breaks* write consistency for code that *is* transaction aware. Calling begin() will prematurely commit the already open transaction.
- It provides a snapshot for consistent reads, which improves
application correctness when concurrent writes are occurring.
Ok.
Initially, I set up a scheme where transactions were "nested", in the sense that begin() incremented the transaction level and commit() decremented it. When it was decremented to zero, an actual COMMIT was issued. So you would have a call sequence like:
- begin() -- sends BEGIN
- begin() -- does nothing
- commit() -- does nothing
- commit() -- sends COMMIT
This scheme soon proved to be inappropriate, since it turned out that the most important thing for performance and correctness is for an application to be able to commit the current transaction after some particular query has completed. Database::immediateCommit() was introduced to support this use case -- its function was to immediately reduce the transaction level to zero and commit the underlying transaction.
Ok.
When it became obvious that that every Database::commit() call should really be Database::immediateCommit(), I changed the semantics, effectively renaming Database::immediateCommit() to Database::commit(). I removed the idea of nested transactions in favour of a model of cooperative transaction length management:
- Database::begin() became effectively a no-op for web requests and
was sometimes omitted for brevity.
But it isn't! It's not a no-op if there is an active transaction! It *breaks* the active transaction! I think that is the crucial point here.
- Database::commit() should be called after completion of a sequence
of write operations where atomicity is desired, or at the earliest opportunity when contended locks are held.
Ok, so it's basically a savepoint.
Using that scheme, a new transaction should be started immediately after the commit(). I guess when DBO_TRX is set, query() will do that.
In cases where transactions end up being too short due to the need for a called function to commit a transaction when the caller already has a transaction open, it is the responsibility of the callers to introduce some suitable abstraction for serializing the transactions.
In the presence of hooks implemented by extensions, this frankly seems impossible. Also, it would require functions to document exactly if and when they are using transactions, and hooks have to document whether implementors can use transactions.
Currently, the only safe way for an extension to use transactions seems to be to check the trxLevel explicitly, and only start a transaction if there is none already in progress. Which effectively brings us back to the level-counting scheme.
When transactions too long, you hit performance problems due to lock contention.
Yes... which makes me wonder why it's a good idea to start a transaction upon the first select, even for requests that do not write to the database at all.
When transactions are too short, you hit consistency problems when requests fail. The scheme I introduced favours performance over consistency. It resolves conflicts between callers and callees by using the shortest transaction time. I think was an appropriate choice for Wikipedia, both then and now, and I think it is probably appropriate for many other medium to high traffic wikis.
In terms of performance, perhaps it would be feasible to use short transactions with an explicit begin() with savepoints for nesting. But then you would lose the consistency benefits of DBO_TRX that I mentioned at the start of this post.
I'm trying to think of a way to implement this scheme without breaking transactions and causing creeping inconsistencies.
For the semantics you propose, begin() and commit() seem to be misleading names - flush() would be more descriptive of what is going on, and implies no nesting.
So, how about this:
* normally, flush() will commit any currently open DBO_TRX transaction (and the next query will start a new one). In cli (auto-commit) mode, it would do nothing.
* begin() and end() (!) use counted levels. In auto-commit mode, the outer-most level of them starts and commits a transaction.
* Between begin and end, flush does nothing.
This would allow us to use the "earliest commit" semantics, but also control blocks of DB operations that should be consistent and atomic. And it would make it explicit to programmers what they are doing.
-- daniel
PS: for the time being, I suggest to just disable the transaction warnings if the DBO_TRX flag is set. That lets us keep warnings about the use of nested transactions in unit tests without flooding the logs.