Re: Eliminating SPI / SQL from some RI triggers - take 3

From: Tomas Vondra <tomas(at)vondra(dot)me>
To: Amit Langote <amitlangote09(at)gmail(dot)com>, Junwang Zhao <zhjwpku(at)gmail(dot)com>
Cc: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Eliminating SPI / SQL from some RI triggers - take 3
Date: 2026-03-01 12:22:49
Message-ID: 43b56e3b-80d6-4a45-a313-cb1caa587b22@vondra.me
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 2/28/26 08:08, Amit Langote wrote:
> Hi Junwang,
>
> ...
>
> Tomas Vondra also tested with an I/O-intensive workload (dataset
> larger than shared_buffers, combined with his and Peter Geoghegan's
> I/O prefetching patches) and confirmed that the batching + SAOP
> approach helps there too, not just in the CPU-bound / memory-resident
> case. In fact he showed that the patches here don't make a big dent
> when the main bottleneck is I/O as shown in numbers that he shared in
> an off-list email:
>
> master: 161617 ms
> ri-check (0001..0004): 149446 ms (1.08x)
> ri-check + i/o prefetching: 50885 ms (3.2x)
>
> So the RI patches alone only give ~8% here since most time is waiting
> on reads. But the batching gives the prefetch machinery a window of
> upcoming probes to issue readahead against, so the two together yield
> 3.2x.
>

I tested this (with the index prefetching v11 patch), because I wanted
to check if the revised API works fine for other use cases, not just the
regular index scans. Turns out the answer is "yes", the necessary tweaks
to the FK batching patch were pretty minimal, and at the same time it
did help quite a bit for cases bottle-necked on I/O.

FWIW I wonder how difficult would it be to do something like this for
inserts into indexes. It's an orthogonal issue to FK checks (especially
for the CPU-bound cases this thread focuses on), but it's a bit similar
to the I/O-bound case. In fact, I now realize I actually did a PoC for
that in 2023-11 [1], but it went stale ...

benchmarks
----------

Anyway, thinking about the CPU-bound case, I decided to do a bit of
testing on my own. I was wondering about three things:

(a) how does the improvement depend on data distribution
(b) could it cause regressions for small inserts
(c) how sensitive is the batch size

So I devised two simple benchmarks:

1) run-pattern.sh - Inserts batches of values into a table, both the
batch and table can be either random or sequential. It's either 100k or
1M rows, logged or unlogged, etc.

2) run-pgbench.sh - Runs short pgbench inserting data into a table,
similar to (1), but with very few rows - so the timing approach is not
suitable to measure this.

Both scripts run against master, and then patched branch with three
batch sizes (default 64, 16 and 256).

results
-------

The results are very positive - see the attached PDF files comparing the
patched builds to master.

I have not found a single case where the batching causes regressions.
This surprised me a bit, I've expected small regressions for single-row
inserts in the pgbench test, but even that shows a small (~5%) gain.
Even just 2-row inserts show +25% improvement in pgbench throughput.

There are a couple cases where it matches master, I assume that's for
I/O bound cases where the CPU optimizations do not really matter. That's
expected, of course.

I don't see much sensitivity on the batch size. The 256 batches seem to
be a bit slower, but there's little difference between 16 and 64. So I'd
say 64 seems reasonable.

Overall, I think these results looks quite good. I haven't looked at the
code very closely, not beyond adjusting it to work with index prefetch.

[1] https://commitfest.postgresql.org/patch/4622/

--
Tomas Vondra

Attachment Content-Type Size
fk-batch-insert.csv text/csv 43.6 KB
fk-batch-pgbench.csv text/csv 71.1 KB
run-pattern.sh application/x-shellscript 2.6 KB
run-pgbench.sh application/x-shellscript 2.4 KB
fk-batching-patterns.pdf application/pdf 57.1 KB
fk-batching-pgbench.pdf application/pdf 66.6 KB

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message shveta malik 2026-03-01 12:24:43 Re: Skipping schema changes in publication
Previous Message Amit Kapila 2026-03-01 12:22:42 Re: Skipping schema changes in publication