Re: PG19 FK fast path: OOB write and missed FK checks during batched

From: Nikolay Samokhvalov <nik(at)postgres(dot)ai>
To: Amit Langote <amitlangote09(at)gmail(dot)com>
Cc: pgsql-hackers mailing list <pgsql-hackers(at)postgresql(dot)org>, Andrey Borodin <amborodin(at)acm(dot)org>, Kirk Wolak <wolakk(at)gmail(dot)com>
Subject: Re: PG19 FK fast path: OOB write and missed FK checks during batched
Date: 2026-06-10 08:16:16
Message-ID: CAM527d_2OpJ3KCOT1QqGh4neCPpgZTgM+VUxTqVgOSweOzTDQw@mail.gmail.com
Lists: pgsql-hackers

On Tue, Jun 9, 2026 at 6:31 AM Amit Langote <amitlangote09(at)gmail(dot)com> wrote:

> On Mon, Jun 8, 2026 at 5:18 PM Amit Langote <amitlangote09(at)gmail(dot)com> wrote:
> > On Sat, Jun 6, 2026 at 6:13 PM Amit Langote <amitlangote09(at)gmail(dot)com> wrote:
> > > Thanks for the detailed report and reproducers. I’ve started looking
> > > into this.
> >
> > Continuing to look. Appended this to the open items list:
> >
> > https://wiki.postgresql.org/wiki/PostgreSQL_19_Open_Items#Open_Issues
>
> Thanks again, Nik, for the thorough analysis and the reproducers --
> they made all three easy to confirm and pin down. Patches attached:
> 0001 for defect 1, 0002 for defects 2 and 3.
>
> 0001 (defect 1): check and flush before writing the row rather than
> after, and add a per-entry "flushing" flag so a re-entrant add on the
> same entry during a flush takes the per-row path instead of touching
> the mid-flush batch. The flag is cleared in a PG_FINALLY, which also
> resets batch_count, so the entry stays reusable if a flush error is
> caught by a savepoint.
>
> 0002 (defects 2 and 3): rather than track subxact membership per row,
> confine batching to the top transaction level -- in RI_FKey_check,
> when GetCurrentTransactionNestLevel() > 1, use the per-row path. I
> went this way because per-entry subxact tracking isn't enough (one
> entry's batch can mix rows from several levels, since the cache is
> keyed by constraint), and flushing at subxact boundaries doesn't work
> for deferred constraints. Once the cache only ever holds top-level
> rows, a subxact abort has nothing of its own to discard, so
> ri_FastPathSubXactCallback goes away -- that's what fixes your defect
> 2 reproducer. For defect 3, which is still reachable at the top level,
> the same patch adds a cache-wide flag set while ri_FastPathEndBatch
> iterates, so a re-entrant check during the scan takes the per-row path
> instead of inserting into the cache being scanned.
>
> The per-row path still bypasses SPI, so these stay well ahead of the
> pre-19 check in terms of performance. I'd like to recover batching
> across subtransactions properly in v20 but didn't want to rush it now.
>
> On defect 3, can you check whether your reproducer still commits the
> orphan with 0002 applied, or whether (like on my build) it now raises
> the violation? I'd like to be sure the bucket-placement variation you
> hit is actually covered. And of course any review of the patches is
> welcome.
>
> --
> Thanks, Amit Langote
>

Hi Amit,

Thanks for the quick fixes.

I checked v1-0001 + v1-0002 against current master (e18b0cb7) with an
assertion/debug build.

- Both apply cleanly to master (in sequence)
- Defect 1 same-FK re-entry no longer crashes; the original shape completes
  and leaves the expected rows
- Defect 2 subtransaction-abort case now raises the FK violation instead of
  committing orphans
- For your defect 3 question: with 0002 applied, my reproducer no longer
  commits the child2 orphan. It raises:

    ERROR: insert or update on table "child2" violates foreign key constraint "child2_fkey"
    DETAIL: Key (a)=(999999) is not present in table "parent".

After the error, child2_orphans = 0 and child2 is empty in my run.

I also ran the regression suite in that tree; foreign_key passed, and the
full run reported all 245 tests passed.

So v1 looks good to me for the three reported cases.
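
For anyone reading along, the 0001 approach as described above (check and
flush before writing the row, plus a per-entry "flushing" flag) can be
modeled in a few lines of standalone C. This is only a toy sketch under my
own assumptions, not code from the patch: ToyBatch, toy_add, toy_flush,
toy_check_row, BATCH_CAPACITY and reenter_key are all made-up names, and the
re-entrant add is simulated by a field rather than by a real trigger.

```c
#include <assert.h>
#include <stdbool.h>

#define BATCH_CAPACITY 4        /* hypothetical size, small for illustration */

typedef struct ToyBatch
{
    int     rows[BATCH_CAPACITY];   /* pending keys awaiting a batched check */
    int     count;                  /* number of valid entries in rows[] */
    bool    flushing;               /* true while toy_flush() is running */
    int     checked_batched;        /* rows checked via the batched path */
    int     checked_per_row;        /* rows checked via the per-row path */
    int     reenter_key;            /* if >= 0, the flush re-enters once with it */
} ToyBatch;

static void toy_add(ToyBatch *b, int key);

/* Stand-in for the per-row check: just count the call. */
static void
toy_check_row(ToyBatch *b, int key)
{
    (void) key;
    b->checked_per_row++;
}

/* Check every pending row, then reset the batch so it can be reused. */
static void
toy_flush(ToyBatch *b)
{
    b->flushing = true;
    for (int i = 0; i < b->count; i++)
    {
        b->checked_batched++;

        /* Simulate user code firing mid-flush and adding to the same entry. */
        if (b->reenter_key >= 0)
        {
            int     key = b->reenter_key;

            b->reenter_key = -1;
            toy_add(b, key);
        }
    }

    /*
     * Per Amit's description, the patch does this reset in PG_FINALLY so it
     * also runs on error; this toy has no error path to model.
     */
    b->count = 0;
    b->flushing = false;
}

/* Queue a key; flush first if full, and never touch a mid-flush batch. */
static void
toy_add(ToyBatch *b, int key)
{
    if (b->flushing)
    {
        toy_check_row(b, key);  /* re-entrant add: take the per-row path */
        return;
    }
    if (b->count == BATCH_CAPACITY)
        toy_flush(b);           /* flush BEFORE writing, not after */

    assert(b->count < BATCH_CAPACITY);  /* the write below stays in bounds */
    b->rows[b->count++] = key;
}
```

Starting from a zeroed ToyBatch with reenter_key set to some key and calling
toy_add() five times, the fifth call triggers the flush, the simulated
re-entrant add goes through toy_check_row() instead of writing into rows[],
and the fifth key lands in slot 0 of the freshly reset batch.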

Thanks!

Nik
