Nested transactions support for code composability

From: Daniel Fortunov <psycopg-list(at)danielfortunov(dot)com>
To: psycopg(at)postgresql(dot)org
Subject: Nested transactions support for code composability
Date: 2017-01-16 23:26:59
Message-ID: CAH1rg6YS5n6PjiV=9cBr2E--oKXtUuL=S7fNPs+Y6BuwRA0iGA@mail.gmail.com
Lists: psycopg

I'd like to implement support for nested transactions in psycopg2 using a
context manager that internally uses postgres savepoints to implement the
ability to nest transactions within each other, with sensible commit and
rollback semantics.

The end goal is to allow code composability through support for nested
transactions.

Before I go ahead and implement this I wanted to solicit feedback on the
suggestion, because I find it hard to believe I'm the first person to dream
up this idea, so I suspect I'm missing something. (Perhaps it's already
implemented somewhere that I've not found, or maybe it's not implemented
because it's a terrible idea, and if so I'd like to understand why!)

The rest of this email describes what I'm trying to achieve, using some
motivating examples.

Please share your thoughts and suggestions.

Thanks in advance,
Daniel Fortunov

------------------

Suppose we have a database storing data about widgets, and we'd like to
implement a function that does some updates on a few related widget tables.
We'd like to do this atomically, so we wrap it in a transaction.

Let's assume that `Transaction` is a psycopg transaction context manager
with support for nested transactions. It begins a transaction on __enter__,
commits it on __exit__, and rolls back if there is an exception. (This is a
simplified version of what I propose to implement.)
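
To make the proposal concrete, here is a rough sketch of the kind of
context manager I have in mind (illustrative only; the savepoint naming
and the connection-status check are just one possible approach):

import itertools
import psycopg2.extensions

class Transaction:
    """Sketch: the outermost level uses BEGIN/COMMIT/ROLLBACK;
    nested levels fall back to SAVEPOINT / RELEASE / ROLLBACK TO."""

    _ids = itertools.count()

    def __init__(self, cxn):
        self.cxn = cxn
        self.savepoint = None
        self._done = False

    def __enter__(self):
        if self.cxn.status == psycopg2.extensions.STATUS_IN_TRANSACTION:
            # Already inside a transaction: nest via a savepoint instead
            self.savepoint = "nested_tx_%d" % next(self._ids)
            self.cxn.cursor().execute("SAVEPOINT %s" % self.savepoint)
        # Otherwise psycopg2 opens a transaction implicitly on the first
        # execute() (assuming autocommit is off), so nothing to do here
        return self

    def rollback(self):
        # Explicitly roll back this (possibly nested) transaction
        if self.savepoint:
            self.cxn.cursor().execute(
                "ROLLBACK TO SAVEPOINT %s" % self.savepoint)
        else:
            self.cxn.rollback()
        self._done = True

    def __exit__(self, exc_type, exc, tb):
        if self._done:
            return False
        if exc_type is None:
            if self.savepoint:
                self.cxn.cursor().execute(
                    "RELEASE SAVEPOINT %s" % self.savepoint)
            else:
                self.cxn.commit()
        else:
            self.rollback()
        return False  # never swallow the caller's exception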

def update_widget(cxn, widget):
    with Transaction(cxn):
        cur = cxn.cursor()
        cur.execute(...)  # Update table A
        cur.execute(...)  # Update table B
        cur.execute(...)  # Update table C

So far this works fine, and is also currently achievable with the
'autocommit' paradigm. Nothing new here. Anyone can call this code, provide
a connection, and get an atomic update of tables A, B, and C.
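
For comparison, this is roughly how the same function looks today, using
the connection itself as a context manager (psycopg2 commits on success
and rolls back on exception; the connection stays open):

def update_widget(cxn, widget):
    with cxn:  # commit on success, rollback on exception
        cur = cxn.cursor()
        cur.execute(...)  # Update table A
        cur.execute(...)  # Update table B
        cur.execute(...)  # Update table C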

Now, I'd like to write a maintenance job that updates a whole load of
widgets. If the updates on any one widget fail, I'd still like the job to
continue, and process as many widgets as it can. Specifically, I want to
implement this job _without_ changing the existing update_widget()
function, which is already in use.

def maintain_widgets(cxn):
    with Transaction(cxn):
        for widget in widgets:
            try:
                update_widget(cxn, widget)
            except Exception:
                pass  # log exception and continue with other widgets

The Transaction context manager would implement the begin/commit/rollback
transaction semantics by creating/releasing/rolling back to a savepoint,
respectively. In this context, update_widget would seamlessly switch to
using savepoints if it is not the outermost transaction.

This is not possible to achieve with the 'autocommit' paradigm, because you
need the ability to roll back the update of a single widget and then
continue the transaction that updates the rest. It is possible to implement
with explicit savepoints, but that requires update_widget() to be updated,
and breaks existing code, because it would no longer be possible to call
update_widget() outside of a transaction!
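
(For example, if update_widget() were rewritten to use explicit
SAVEPOINT/RELEASE, a caller running with autocommit on, i.e. with no
surrounding transaction, would now fail outright:)

cxn.autocommit = True  # no surrounding transaction
cur = cxn.cursor()
cur.execute("SAVEPOINT sp1")
# -> error from the server:
#    SAVEPOINT can only be used in transaction blocks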

Now I want to take it a level further, and write a database integration
test for my widget maintenance job, which executes the test scenario
within a transaction, guaranteeing that the database is left untouched
after my test, regardless of whether the test succeeds, fails, or aborts
unexpectedly (e.g. maybe the test is running in an interactive debugger
and I abort the debugging session! :-)

def test_maintain_widget_successful_run(self):
    with Transaction(cxn) as txn:
        # Set up starting database state
        # ...
        maintain_widgets(cxn)  # Call code under test
        # Check database state against expectations
        # ...
        txn.rollback()  # Roll back all changes

Note that for this final ROLLBACK to work, none of the called code is
allowed to execute COMMIT at any point, which is why this breaks down
completely with the autocommit = False; cxn.commit() paradigm: we'd be
calling commit() explicitly all over the place!
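
(In other words, once anything in the call chain commits, the test's final
rollback can no longer undo it; the table name below is made up:)

cur = cxn.cursor()
cur.execute("INSERT INTO widget_a VALUES (1)")  # hypothetical table
cxn.commit()     # buried somewhere deep inside the code under test
# ... later, back in the test:
cxn.rollback()   # too late: the INSERT above is already permanent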

With the current paradigm you're forced to manage transactions at the
uppermost level, which means callers of update_widget() are obliged to
call it within a pre-existing transaction in order to get correct behaviour
(namely, atomic update of all three tables in the face of errors). And even
then, this precludes having a test one level above, which executes that
"uppermost level" code within a transaction and then rolls everything back!

So, what am I missing? Am I the first to have this crazy idea? Is this
already implemented? How do others solve the problem of code composability
and transactions? Is this a terrible idea? (if so, why?)

Feedback and comments welcome.
Thanks for reading this far!
