Re: Foreign key isolation tests

From: Paul Jungwirth <pj(at)illuminatedcomputing(dot)com>
To: Rustam ALLAKOV <rustamallakov(at)gmail(dot)com>, pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: Foreign key isolation tests
Date: 2025-07-11 22:22:01
Message-ID: 60089a00-d586-4747-90f8-99d65e19c81c@illuminatedcomputing.com
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 7/11/25 11:50, Rustam ALLAKOV wrote:
> Hi Paul,
> Thanks for this patch!
> Lilian and I have reviewed your patches.

Thank you for the review! I've attached new patches addressing your feedback.

> All patches apply and all tests are passing.
> Since you are adding tests, we expected to see some improvement on the test
> coverage. Yet no significant improvement was observed. We have tested vanila
> postgres at commit 931766aaec58b2ce09c82203456877e0b05e1271, run coverage.
> Then we have applied first patch v1-0001, tested and run coverage.
> Lastly v2-0001 was applied on top of v1-0001, tested and run coverage.
> All three coverage reports can be found here [1].
>
> We expected some increase in coverage especially in this file
> `src/backend/utils/adt/ri_triggers.c` perhaps we are looking at the wrong
> file?

A system isn't fully tested just because all lines are tested.
For instance suppose you have `int add(int x, int y) { return x + y; }`.
A test may confirm that `add(1, 1)` equals 2, and you have 100% test coverage,
but `add(INT_MAX, 1)` equaling -2147483648 is probably still a bug.
Similarly, these tests don't cover untested *lines*, but untested *scenarios*.

Since this is a tricky scenario to get correct, I think we should have a test for it.
I was asked to add these tests for temporal foreign keys, and I was surprised to find
we don't have them for regular foreign keys either, so these patches add both.

If you compile with this change, all old tests still pass, but the new tests will fail:

diff --git a/src/backend/utils/adt/ri_triggers.c b/src/backend/utils/adt/ri_triggers.c
index 059fc5ebf60..8738b2d7e03 100644
--- a/src/backend/utils/adt/ri_triggers.c
+++ b/src/backend/utils/adt/ri_triggers.c
@@ -892,7 +892,7 @@ ri_restrict(TriggerData *trigdata, bool is_no_action)
fk_rel, pk_rel,
oldslot, NULL,
!is_no_action,
- true, /* must detect new rows */
+ false, /* must detect new rows */
SPI_OK_SELECT);

if (SPI_finish() != SPI_OK_FINISH)

> In addition to that, related to the comment
> 'Violates referential integrity unless we use an up-to-date crosscheck snapshot:'
> Could you please clarify:
> Why is this necessary to specify usage of an up-to-date crosscheck snapshot?
> When would the crosscheck snapshot be not up to date?
> Should we add tests for up-to-date crosscheck snapshots?

The point is that the crosscheck snapshot *is* up to date, compared to the current snapshot.
The adjective is descriptive, not limiting.
I re-worded the comment a bit to hopefully make this more clear.

I think that answers all the questions you're asking here, but to give some more context:
The crosscheck snapshot is required to prevent bad rows sneaking past the foreign key constraint.
Take the two tables from fk-snapshot-2.spec. The parent has a row with parent_id 1.
Now imagine you have two connections, both REPEATABLE READ, that do things in this order
(without a crosscheck snapshot):

connection 2: insert into child (child_id, parent_id) values (1, 1);
connection 2: -- RI triggers run, the constraint looks good.
connection 2: -- We have a FOR KEY SHARE lock on the parent row.
connection 1: delete from parent where parent_id = 1;
connection 1: -- We can't delete the row until we get our own lock,
connection 1: -- which conflicts with FOR KEY SHARE.
connection 1: -- (Since this is a tuple lock, in pg_locks you see us waiting on
connection 1: -- the transactionid of connection 2.)
connection 2: commit;
connection 1: -- We get the lock, delete the row, and run our RI check.
connection 1: -- We don't see the new insert, so the constraint looks good.
connection 1: commit;

We have violated the foreign key!

> Also in v1-0002 you only test 'REPEATABLE READ' isolation level, what about the
> other two? ('READ COMMITTED', 'SERIALIZABLE')

In READ COMMITTED, connection 1 is allowed to see the inserted row, so it correctly rejects the delete anyway.

In SERIALIZABLE, the sequence would raise a cannot-serialize exception anyway, although the cross-check
snapshot helps us give a better error message.

I added these cases to fk-snapshot-{2,3}.spec. I don't think they are as important as REPEATABLE READ,
but for something as fundamental as foreign keys it is reassuring to have them,
and I like that they document what happens.

> to test this specific feature only instead of running the whole test suit.
> These are the steps we came up with so far: (we use Macos)
>
> ./configure --prefix=$PGINSTALL/paul --enable-cassert --enable-debug \
> --enable-coverage --enable-tap-tests --without-icu --without-zstd \
> --without-zlib CFLAGS="-ggdb -O0 -fno-omit-frame-pointer" CPPFLAGS="-g -O0"
> make -j12 -s
> make -j12 install
> cd src/test/isolation
> make check TESTNAME="fk-snapshot fk-snapshot-2 fk-snapshot-3"
> make coverage-html

I have the same --enable-* options.
But does TESTNAME actually restrict your run to just those three? It doesn't for me.

As far as I know there is no way to run selected isolation tests, other than editing the isolation_schedule file.
For regress tests you can say `make check-tests TESTS=bit`. (Note the different Makefile target.)
Within src/test/subscription you can use `make check PROVE_TESTS=t/034_temporal.pl`.
Weirdly `grep -r TESTNAME src/test/isolation` does say that pg_isolation_regress matches.
Is there documentation saying that it should do something?
If there is a way to run just one isolation test, I'd love to know.

Rebased to 4df477153a.

Yours,

--
Paul ~{:-)
pj(at)illuminatedcomputing(dot)com

Attachment Content-Type Size
v2-0001-Fill-testing-gap-for-possible-referential-integri.patch text/x-patch 7.1 KB
v2-0002-Add-test-for-temporal-referential-integrity.patch text/x-patch 6.4 KB
v2-0003-Improve-comment-about-snapshot-macros.patch text/x-patch 1.3 KB

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Michael Paquier 2025-07-12 01:18:50 Re: Can can I make an injection point wait occur no more than once?
Previous Message Melanie Plageman 2025-07-11 22:19:15 Re: eliminate xl_heap_visible to reduce WAL (and eventually set VM on-access)