Re: Deferred constraint trigger semantics

From: Bryn Llewellyn <bryn(at)yugabyte(dot)com>
To: Laurenz Albe <laurenz(dot)albe(at)cybertec(dot)at>
Cc: pgsql-general list <pgsql-general(at)lists(dot)postgresql(dot)org>, david(dot)g(dot)johnston(at)gmail(dot)com
Subject: Re: Deferred constraint trigger semantics
Date: 2022-05-16 19:03:12
Message-ID: 22A28644-9544-40E8-8150-25DA4E50DCEB@yugabyte.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

> laurenz(dot)albe(at)cybertec(dot)atwrote:
>
>> bryn(at)yugabyte(dot)com wrote:
>
>>
>> …I tried this:
>>
>> create table t(
>> k serial primary key,
>> v int not null,
>> constraint t_v_unq unique(v) initially deferred);

Here's a better test:

-- BLUE session
start transaction isolation level read committed;
insert into t(v) values (1), (2);

-- RED session
start transaction isolation level read committed;
insert into t(v) values (1), (3);

-- BLUE session
set constraints all immediate;

-- RED session (hangs until BLUE commits).
-- Then, when it does, gets ERROR... "Key (v)=(1) already exists"
set constraints all immediate;

-- BLUE session
commit;

-- RED session
-- There are no changes to commit 'cos they were rolled back.
commit;

-- "select k, v from t order by k" (in each session) now shows that both sessions meet the constraint.

>> Where, in the PG doc, can I read the account of the proper mental model for the application programmer?
>
> [See https://www.postgresql.org/docs/current/index-unique-checks.html.]

Thanks for referring me to the account "62.5. Index Uniqueness Checks". It's in the section "Part VII. Internals" (…contains assorted information that might be of use to PostgreSQL developers). I wouldn't expect to read this because I don't intend to write code that might become part of PG's implementation.

> I'd say that the proper mental model is that you don't need to care… The exact sequence of what happens during COMMIT is interesting, but irrelevant to the programmer. All that counts is "a deferred constraint is checked between the time that COMMIT starts processing and the time that it returns".

Yes, I very much like this stance. It seems that, for built-in constraints (like "unique" or "foreign key") it's enough to understand that PG implements these at the "read committed" isolation level by using methods (that aren't exposed via SQL) to peep below the application programmer's MVCC view of the world to check the uncommitted state of other, concurrent, sessions.

This explains why, in the (new) test that I used above, the conflict is detected when the second session issues "set constraints all immediate" after the first already did this (i.e. long before COMMIT). In this case, the second session hangs until the first commits—at which point the second sees the uniqueness violation error.

In other words, the automagic implementation of the enforcement of built-in constraints allows the safe use of "set constraints all immediate" to provoke a possible early error that can, very usefully, be handled in PL/pgSQL code. This is the clue to understanding why the check of a built-in constraint, when it's performed as an implicit consequence of "commit", doesn't need to be within the small part of the operations that "commit" causes that are strictly serialized. (You've explained how this helps performance in multi-session scenarios.)

Critically, the special methods that implement the enforcement of built-in constraints aren't accessible in PL/pgSQL code and therefore not accessible in the "ordinary" implementation of trigger functions. This is the point that I failed to grasp. (Though I do see, now, that Laurenz's post says this clearly.)

I was able to demonstrate this by implementing a unique constraint with a deferred constraint trigger (and no use of "set constraints all immediate"). I simply introduced "pg_sleep(5)" between the trigger function's actual check and its final "return null". I copied the code below for completeness.

The "BLUE" session, because it reaches its serialized "commit" actions first, sees an outcome that meets the constraint. But the "RED" session has enough time to do its check before "BLUE" does its serialized "commit" actions. So its test passes too. This leaves the final database in conflict with the intended constraint.

I see now that the only robust use of an ordinarily (i.e. not using C) implemented constraint trigger (deferred or otherwise) is to enforce a single row-constraint. (There's a caveat that maybe, after careful analysis, you can work out a cunning lockings scheme to allow the safe implementation of an entity-level constraint without using C. But the "exactly one or two admins in a department" example shows that this isn't generally possible.) So it's reasonable that a constraint trigger must be AFTER EACH ROW. Further, it would make no sense to do SQL from its implementation function because the only values that you might defensibly use are available simply via "old" and "new".

So all that stuff I was concerned about where the deferred constraint fires many times when once is enough falls away because the larger endeavor makes no sense. (I suppose that it might matter if you implemented the trigger function robustly in C.)

It does strike me that the ordinary application programmer—who reads just the sections "CREATE TRIGGER", "Chapter 39. Triggers", "CREATE TABLE", and "SET CONSTRAINTS"—will have a hard time to reach a correct understanding of what I believe that, with huge help from Laurenz Albe and David Johnston, I hope that I now have. This is the key sentence that needs careful interpretation:

"If the constraint is INITIALLY DEFERRED, it is checked only at the end of the transaction."

One's understanding is crucially determined by realizing that "at the end of the transaction" means "concurrently, when two or more sessions hit COMMIT at the exact same moment—and therefore *before* that part of the commit actions that is serialized. Oh well, I won't push that point.

--------------------------------------------------------------------------------
-- set-up.sql
drop table if exists t;
create table t(
k serial primary key,
v int not null);

-- No need, in this simple demo, to use a latching scheme
-- to execute constraint_trg_fn()'s test only once.
drop function if exists constraint_trg_fn() cascade;
create function constraint_trg_fn()
returns trigger
language plpgsql
as $body$
begin
set constraint_trigger.fired = 'true';
if
(
(select count(*) from t) > (select count(distinct v) from t)
)
then
raise exception using
errcode = 'raise_exception',
message = '"t.v" values must be unique',
hint = 'try again';
end if;
perform pg_sleep(5);
return null;
end;
$body$;

create constraint trigger constraint_trg
after insert on t
initially deferred
for each row
execute function constraint_trg_fn();
--------------------------------------------------------------------------------
-- blue.sql

set default_transaction_isolation = 'read committed';
do $body$
begin
insert into t(v) values (1), (2);
end;
$body$;
select k, v from t order by k;

--------------------------------------------------------------------------------
-- red.sql

set default_transaction_isolation = 'read committed';
do $body$
begin
insert into t(v) values (1), (3);
end;
$body$;
select k, v from t order by k

--------------------------------------------------------------------------------
-- RED session

\i set-up.sql

--------------------------------------------------------------------------------
-- BLUE session

\i blue.sql
--------------------------------------------------------------------------------
-- RED session
-- to be done as soon as your reflexes allow after starting "red.sql"

\i red.sql

--------------------------------------------------------------------------------

In response to

Browse pgsql-general by date

  From Date Subject
Next Message Bryn Llewellyn 2022-05-16 21:04:05 Re: Restricting user to see schema structure
Previous Message Zheng Li 2022-05-13 21:01:33 Re: Support logical replication of DDLs