Re: Unnecessary checks for new rows by some RI trigger functions?

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Antonin Houska <ah(at)cybertec(dot)at>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Unnecessary checks for new rows by some RI trigger functions?
Date: 2019-02-21 17:02:25
Message-ID: 25176.1550768545@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Robert Haas <robertmhaas(at)gmail(dot)com> writes:
> On Wed, Feb 20, 2019 at 9:27 AM Antonin Houska <ah(at)cybertec(dot)at> wrote:
>> However I find it confusing that the trigger functions pass detectNewRows=true
>> even if they only execute SELECT statement.

> I don't quite see what those two things have to do with each other,
> but I might be missing something.

Me too. It's quite easy to demonstrate that the second part of Antonin's
proposed patch (ie, don't check for new rows in the FK table during
ri_restrict) is wrong:

regression=# create table p(f1 int primary key);
CREATE TABLE
regression=# create table f(f1 int references p on delete no action deferrable initially deferred);
CREATE TABLE
regression=# insert into p values (1);
INSERT 0 1
regression=# begin transaction isolation level serializable ;
BEGIN
regression=# table f;
f1
----
(0 rows)

regression=# delete from p where f1=1;
DELETE 1

-- now, in another session:

regression=# insert into f values (1);
INSERT 0 1

-- next, back in first session:

regression=# commit;
ERROR: update or delete on table "p" violates foreign key constraint "f_f1_fkey" on table "f"
DETAIL: Key (f1)=(1) is still referenced from table "f".

With the patch, the first session's commit succeeds, and we're left
with a violated FK constraint.

I'm less sure about whether the change in ri_Check_Pk_Match would be
safe. On its face, what that is on about is the case where you have
matching rows in p and f, and a serializable transaction tries to
delete the p row, and by the time it commits some other transaction
has inserted a new p row so we could allow our deletion to succeed.
If you try that, however, you'll notice that the other transaction
can't commit its insertion because we haven't committed our deletion,
so the unique constraint on the PK side would be violated. So maybe
there's a case for simplifying the logic there. Or maybe there are
actually live cases for that with more complex MATCH rules than what
I tried.

But TBH I think Antonin's question is backwards: the right thing to
be questioning here is whether detectNewRows = false is *ever* OK.
I think he mischaracterized what that option really does; the comments
in ri_PerformCheck explain it this way:

* In READ COMMITTED mode, we just need to use an up-to-date regular
* snapshot, and we will see all rows that could be interesting. But in
* transaction-snapshot mode, we can't change the transaction snapshot. If
* the caller passes detectNewRows == false then it's okay to do the query
* with the transaction snapshot; otherwise we use a current snapshot, and
* tell the executor to error out if it finds any rows under the current
* snapshot that wouldn't be visible per the transaction snapshot.

The question therefore is under what scenarios is it okay for an FK
constraint to be enforced (in a serializable or repeatable-read
transaction) without attention to operations already committed by
concurrent transactions? If your gut answer isn't "damn near none",
I don't think you're being paranoid enough ;-)

The one case where we use detectNewRows == false today is in
RI_FKey_check, which we run after an insert or update on the FK
table to see if there's a matching row on the PK side. In this
case, failing to observe a newly-added PK row won't result in a
constraint violation, just an "unnecessary" error. I think this
choice is reasonable, since if you're running in serializable
mode you're not supposed to want to depend on transactions
committed after you start. (Note that if somebody concurrently
*deletes* the PK row we need to match, we will notice that,
because the SELECT FOR SHARE will, and then it'll throw a
serialization error which seems like the right thing here.)

Perhaps a similar argument can be constructed to justify changing
the behavior of ri_Check_Pk_Match, but I've not thought about it
very hard. In any case, changing ri_restrict is provably wrong.

> Unfortunately, as you probably also noticed, making the change
> you propose here doesn't make any tests fail in either the main
> regression tests or the isolation tests.

Yeah. All the RI code predates the existence of the isolation
test machinery, so it's not so surprising that we don't have good
coverage for this type of situation. I thought I remembered that
Peter had added some more FK tests recently, but I don't see
anything in the commit log ...

regards, tom lane

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Robert Haas 2019-02-21 17:08:39 Re: ATTACH/DETACH PARTITION CONCURRENTLY
Previous Message Gilles Darold 2019-02-21 16:42:03 Re: [patch] Add schema total size to psql \dn+