Rearranging ALTER TABLE to avoid multi-operations bugs

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Rearranging ALTER TABLE to avoid multi-operations bugs
Date: 2019-05-26 22:23:48
Message-ID: 10365.1558909428@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox
Thread:
Lists: pgsql-hackers

We've had numerous bug reports complaining about the fact that ALTER TABLE
generates subsidiary commands that get executed unconditionally, even
if they should be discarded due to an IF NOT EXISTS or other condition;
see e.g. #14827, #15180, #15670, #15710. In [1] I speculated about
fixing this by having ALTER TABLE maintain an array of flags that record
the results of initial tests for column existence, and then letting it
conditionalize execution of subcommands on those flags. I started to
fool around with that concept today, and quickly realized that my
original thought of just adding execute-if-this-flag-is-true markers to
AlterTableCmd subcommands was insufficient. Most of the problems are with
independent commands that execute before or after the main AlterTable,
and would not have any easy connection to state maintained by AlterTable.

The way to fix this, I think, is to provide an AlterTableCmd subcommand
type that just wraps an arbitrary utility statement, and then we can
conditionalize execution of such subcommands using the flag mechanism.
So instead of generating independent "before" and "after" statements,
transformAlterTableStmt would just produce a single AlterTable with
everything in its list of subcommands --- but we'd still use the generic
ProcessUtility infrastructure to execute subcommands that correspond
to existing standalone statements.

Looking into parse_utilcmd.c with an eye to making it do that, I almost
immediately ran across bugs we hadn't even known were there in ALTER TABLE
ADD/DROP GENERATED. These have got a different but arguably-related
flavor of bug: they are making decisions inside transformAlterTableStmt
that might be wrong by the time we get to execution. Thus for example

regression=# create table t1 (f1 int);
CREATE TABLE
regression=# alter table t1 add column f2 int not null,
alter column f2 add generated always as identity;
ALTER TABLE
regression=# insert into t1 values(0);
ERROR: no owned sequence found

This happens because transformAlterTableStmt thinks it can generate
the sequence creation commands for the AT_AddIdentity subcommand,
and also figures it's okay to just ignore the case where the column
doesn't exist. So we create the column but then we don't make the
sequence. There are similar bugs in AT_SetIdentity processing, and
I rather suspect that it's also unsafe for AT_AlterColumnType to be
looking at the column's attidentity state --- though I couldn't
demonstrate a bug in that path, because of the fact that
AT_AlterColumnType executes in a pass earlier than anything that
could change attidentity.

This can't be fixed just by conditionalizing execution of subcommands,
because we need to know the target column's type in order to set up the
sequence correctly. So what has to happen to fix these things is to
move the decisions, and the creation of the subcommand parsetrees,
into ALTER TABLE execution.

That requires pretty much the same support for recursively calling
ProcessUtility() from AlterTable() that we'd need for the subcommand
wrapper idea. So I went ahead and tackled it as a separate project,
and attached is the result.

I'm not quite sure if I'm satisfied with the approach shown here.
I made a struct containing the ProcessUtility parameters that need
to be passed down through the recursion, originally with the idea
that this struct might be completely opaque outside utility.c.
However, there's a good deal of redundancy in that approach ---
the relid and stmt parameters of AlterTable() are really redundant
with stuff in the struct. So now I'm wondering if it would be better
to merge all that stuff and just have the struct as AlterTable's sole
argument. I'm also not very sure whether AlterTableInternal() ought
to be modified so that it uses or at least creates a valid struct;
it doesn't *need* to do so today, but maybe someday it will.

And the whole thing has a faint air of grottiness about it too.
This makes the minimum changes to what we've got now, but I can't
help thinking it'd look different if we'd designed from scratch.
The interactions with event triggers seem particularly ad-hoc.
It's also ugly that CreateTable's recursion is handled differently
from AlterTable's.

Anybody have thoughts about a different way to approach it?

regards, tom lane

[1] https://postgr.es/m/7824.1525200461@sss.pgh.pa.us

Attachment Content-Type Size
alter-table-with-recursive-process-utility-calls-wip.patch text/x-diff 30.3 KB

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2019-05-26 22:29:34 Re: Fix inconsistencies for v12
Previous Message Amit Kapila 2019-05-26 20:45:10 Re: Fix inconsistencies for v12