Transaction control in procedures

From: Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com>
To: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Transaction control in procedures
Date: 2017-10-31 19:38:05
Message-ID: 178d3380-0fae-2982-00d6-c43100bc8748@2ndquadrant.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Here is a patch that implements transaction control in PL/Python
procedures. (This patch goes on top of "SQL procedures" patch v1.)

So you can do this:

CREATE PROCEDURE transaction_test1()
LANGUAGE plpythonu
AS $$
for i in range(0, 10):
plpy.execute("INSERT INTO test1 (a) VALUES (%d)" % i)
if i % 2 == 0:
plpy.commit()
else:
plpy.rollback()
$$;

CALL transaction_test1();

I started with PL/Python because the internal structures there are more
manageable. Obviously, people will want this for PL/pgSQL as well, and
I have that in the works. It's not in a usable state, but I have found
that the work needed is essentially the same as in PL/Python for example.

I have discovered three groups of obstacles that needed addressing to
make this work. At this point, the patch is more of a demo of what
these issues are, and if we come to satisfactory solutions for each of
them, things should fall into place more easily afterwards.

1) While calling CommitTransactionCommand() in the middle of a utility
command works just fine (several utility commands do this, of course),
calling AbortCurrentTransaction() in a similar way does not. There are
a few pieces of code that think that a transaction abort will always
result in a return to the main control loop, and so they will just clean
away everything. This is what the changes in portalmem.c are about.
Some comments there already hint about the issue. No doubt this will
need further refinement. I think it would be desirable in general to
separate the control flow concerns from the transaction management
concerns more cleanly.

2) SPI needs some work. It thinks that it can clean everything away at
transaction end. I have found that instead of TopTransactionContext one
can use PortalContext and get a more suitable life cycle for the memory.
I have played with some variants to make this configurable (e.g.,
argument to SPI_connect()), but that didn't seem very useful. There are
some comments indicating that there might not always be a PortalContext,
but the existing tests don't seem to mind. (There was a thread recently
about making a fake PortalContext for autovacuum, so maybe the current
idea is that we make sure there always is a PortalContext.) Maybe we
need another context like StatementContext or ProcedureContext.

There also needs to be a way via SPI to end transactions and allowing
*some* cleanup to happen but leaving the memory around. I have done
that via additional SPI API functions like SPI_commit(), which are then
available to PL implementations. I also tried making it possible
calling transaction statements directly via SPI_exec() or similar, but
that ended up a total disaster. So from the API perspective, I like the
current implementation, but the details will no doubt need refinement.

3) The PL implementations themselves allocate memory in
transaction-bound contexts for convenience as well. This is usually
easy to fix by switching to PortalContext as well. As you see, the
PL/Python code part of the patch is actually very small. Changes in
other PLs would be similar.

Two issues have not been addressed yet:

A) It would be desirable to be able to run commands such as VACUUM and
CREATE INDEX CONCURRENTLY in a procedure. This needs a bit of thinking
and standards-lawyering about the semantics, like where exactly do
transactions begin and end in various combinations. It will probably
also need a different entry point into SPI, because SPI_exec cannot
handle statements ending transactions. But so far my assessment is that
this can be added in a mostly independent way later on.

B) There needs to be some kind of call stack for procedure and function
invocations, so that we can track when we are allowed to make
transaction controlling calls. The key term in the SQL standard is
"non-atomic execution context", which seems specifically devised to
cover this scenario. So for example, if you have CALL -> CALL -> CALL,
the third call can issue a transaction statement. But if you have CALL
-> SELECT -> CALL, then the last call cannot, because the SELECT
introduces an atomic execution context. I don't know if we have such a
thing yet or something that we could easily latch on to.

--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Attachment Content-Type Size
v1-0001-Transaction-control-in-PL-Python-procedures.patch text/plain 13.8 KB

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2017-10-31 19:46:32 Re: Re: PANIC: invalid index offnum: 186 when processing BRIN indexes in VACUUM
Previous Message Tom Lane 2017-10-31 19:09:11 Re: Re: PANIC: invalid index offnum: 186 when processing BRIN indexes in VACUUM