From: | Olga Antonova <o(dot)antonova(at)postgrespro(dot)ru> |
---|---|
To: | Andres Freund <andres(at)anarazel(dot)de>, Yura Sokolov <y(dot)sokolov(at)postgrespro(dot)ru> |
Cc: | Sergey Shinderuk <s(dot)shinderuk(at)postgrespro(dot)ru>, Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, "pgsql-hackers(at)lists(dot)postgresql(dot)org" <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
Subject: | Re: Read-Write optimistic lock (Re: sinvaladt.c: remove msgnumLock, use atomic operations on maxMsgNum) |
Date: | 2025-08-22 07:40:49 |
Message-ID: | db4aca5c-c22b-4eb5-850d-212768f4fcac@postgrespro.ru |
Lists: | pgsql-hackers |
Hi,
On 7/16/25 18:54, Andres Freund wrote:
> That was not in reply to the changed patch, but about the performance numbers
> you relayed. We had no repro, and even with the repro that Sergey has now
> delivered, we don't see similar levels of what you reported as contention.
We investigated this issue in detail and were able to reproduce the
spinlock contention in SIGetDataEntries. The problem is most evident on
multiprocessor systems with multiple NUMA nodes, but it also occurs on a
single node, albeit less pronounced. This is probably also the case for
high-frequency CPUs.
We ran tests on two bare-metal servers:
4 NUMA nodes × 24 CPUs Intel(R) Xeon(R) Gold 6348H CPU @ 2.30GHz.
PostgreSQL was running on 3 nodes (72 CPUs).
2 NUMA nodes × 32 CPUs Intel(R) Xeon(R) Gold 6338 CPU @ 2.00GHz.
PostgreSQL was running on a single node (32 CPUs).
and with two PostgreSQL builds: one from the master branch and one with the patch
v5-0001-Read-Write-optimistic-spin-lock.patch.
To generate frequent cache invalidations, we executed a background
workload that repeatedly created and dropped temporary tables with
indexes in a loop.
do $$
begin
  for i in 1..1000000 loop
    create temp table tt1 (
      f0 bigserial primary key,
      f1 int,
      f2 int,
      f3 int,
      f4 int,
      f5 int,
      f6 int,
      f7 int,
      f8 int,
      f9 int,
      f10 int);
    create index on tt1(f1);
    create index on tt1(f2);
    create index on tt1(f3);
    create index on tt1(f4);
    create index on tt1(f5);
    create index on tt1(f6);
    create index on tt1(f7);
    create index on tt1(f8);
    create index on tt1(f9);
    create index on tt1(f10);
    drop table tt1;
    commit;
  end loop;
end;
$$;
As a benchmark, we used a pgbench select-only scenario with 64 clients:
pgbench -U postgres -c 64 -j 32 -T 200 -s 100 -M prepared -b select-only postgres -n
For convenience, the test is included as test.sh (attached), with
description and setup instructions provided in the README.
During the test, we ran perf for 10 seconds using the command
perf record -F 99 -a -g --call-graph=dwarf -o perf_data sleep 10
and then generated flame graphs from the collected data.
1. Three NUMA nodes (72 CPUs)
According to the flame graph (fg_3numa_nopatch.xml), about 34% of
exec_bind_message time is spent in SIGetDataEntries, more than 90% of
which is spinlock wait.
With the patch, the share of SIGetDataEntries decreases to ~6.6%, most
of the waiting shifts to LWLockAcquire, and RWOptSpinReadStart accounts
for only ~1.1% (fg_3numa_patch.xml). TPS improvement: +6–8% (over 5 runs).
Without patch: TPS = 731171.336542
With patch: TPS = 786077.155196
2. Single NUMA node (32 CPUs)
In this case the problem is less pronounced, but SIGetDataEntries still
takes 10.1% of exec_bind_message, of which 82.3% is spinlock wait
(fg_1numa_nopatch.xml).
With the patch we observed a stable 1.5–2% TPS increase (5 runs).
Without patch: TPS = 518941.051825
With patch: TPS = 528768.641836
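For readers who want to check the percentages against the raw TPS figures, here is a minimal Python sketch. It uses only the four TPS values quoted above; the dictionary labels and the helper name relative_gain_pct are made up for illustration. Each case covers a single quoted pair, not the spread over the 5 runs.

```python
# TPS figures quoted in this message: (without patch, with patch).
# The labels are illustrative; only the numbers come from the message.
results = {
    "3 NUMA nodes (72 CPUs)": (731171.336542, 786077.155196),
    "1 NUMA node (32 CPUs)": (518941.051825, 528768.641836),
}


def relative_gain_pct(baseline: float, patched: float) -> float:
    """Relative TPS change of the patched build over the baseline, in percent."""
    return (patched - baseline) / baseline * 100.0


# Compute the gain for each configuration and print it with an explicit sign.
gains = {name: relative_gain_pct(b, p) for name, (b, p) in results.items()}
for name, gain in gains.items():
    print(f"{name}: {gain:+.2f}%")
```

The quoted pairs work out to roughly +7.5% and +1.9%, which fall inside the +6–8% and 1.5–2% ranges reported for the two configurations.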
The flame graphs do not show absolute time, but the relative
distribution confirms contention on the spinlock in SIGetDataEntries.
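To put the nested flame-graph shares on one scale, the spinlock wait can be expressed as a fraction of exec_bind_message by multiplying the two percentages. This is a rough sketch using the unpatched figures quoted above. The function name is hypothetical, and because the three-node inputs are "about 34%" and ">90%", that result is only an approximate lower bound.

```python
def share_of_parent(child_pct: float, wait_pct_within_child: float) -> float:
    """Convert a share nested inside a child frame into a share of the parent frame."""
    return child_pct * wait_pct_within_child / 100.0


# Three NUMA nodes, no patch: ~34% in SIGetDataEntries, >90% of that in spinlock wait.
# Using 90 gives an approximate lower bound, since the message says ">90%".
three_node_lower_bound = share_of_parent(34.0, 90.0)

# Single NUMA node, no patch: 10.1% in SIGetDataEntries, 82.3% of that in spinlock wait.
single_node = share_of_parent(10.1, 82.3)

print(f"3 nodes: at least ~{three_node_lower_bound:.1f}% of exec_bind_message")
print(f"1 node:  ~{single_node:.1f}% of exec_bind_message")
```

Under these assumptions, spinlock wait accounts for roughly 30% or more of exec_bind_message on three nodes and about 8% on a single node.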
The problem exists and is a bottleneck under high load, especially on
multiprocessor NUMA systems. The patch mitigates this contention and
improves performance.
---
Best regards,
Olga Antonova
Attachment | Content-Type | Size |
---|---|---|
test.sh | application/x-shellscript | 1.3 KB |
README | text/plain | 943 bytes |
fg_3numa_patch.xml | text/xml | 651.1 KB |
fg_3numa_nopatch.xml | text/xml | 662.6 KB |
fg_1numa_nopatch.xml | text/xml | 718.7 KB |