| From: | Tomas Vondra <tomas(at)vondra(dot)me> |
|---|---|
| To: | Andres Freund <andres(at)anarazel(dot)de> |
| Cc: | Jakub Wartak <jakub(dot)wartak(at)enterprisedb(dot)com>, Alexey Makhmutov <a(dot)makhmutov(at)postgrespro(dot)ru>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
| Subject: | Re: Adding basic NUMA awareness |
| Date: | 2026-01-13 00:51:09 |
| Message-ID: | 06c7e398-3336-49f2-a011-cc89ee28b9c0@vondra.me |
| Views: | Whole Thread | Raw Message | Download mbox | Resend email |
| Thread: | |
| Lists: | pgsql-hackers |
On 1/13/26 01:10, Andres Freund wrote:
> Hi,
>
> On 2026-01-13 00:58:49 +0100, Tomas Vondra wrote:
>> On 1/10/26 02:42, Andres Freund wrote:
>>> psql -Xq -c 'SELECT pg_buffercache_evict_all();' -c 'SELECT numa_node, sum(size) FROM pg_shmem_allocations_numa GROUP BY 1;' && perf stat --per-socket -M memory_bandwidth_read,memory_bandwidth_write -a psql -c 'SELECT sum(abalance) FROM pgbench_accounts;'
>
>> And then I initialized pgbench with scale that is much larger than
>> shared buffers, but fits into RAM. So cached, but definitely > NB/4. And
>> then I ran
>>
>> select * from pgbench_accounts offset 1000000000;
>>
>> which does a sequential scan with the circular buffer you mention abobe
>
> Did you try it with the query I suggested? One plausible reason why you did
> not see an effect with your query is that with a huge offset you actually
> never deform the tuple, which is an important and rather latency sensitive
> path.
>
I did try with the agg query too, and there's still no difference on
either machine.
I can't do the perf on the Azure VM, because the Ubuntu is image is
borked and does not allow installing the package. But on my xeon I can
do the perf, and that gives me this:
numactl --membind=0 --cpunodebind=0 ~/builds/master-test/bin/pg_ctl
-----------------------------------------------------------------------
S0 1 24,677,226 UNC_M_CAS_COUNT.WR # 79.0 MB/s ... idth_write
S0 1 20,001,829,522 ns duration_time ...
S0 1 972,631,426 UNC_M_CAS_COUNT.RD # 3112.2 MB/s ... idth_read
S0 1 20,001,822,807 ns duration_time ...
S1 1 15,602,233 UNC_M_CAS_COUNT.WR # 49.9 MB/s ... idth_write
S1 1 712,431,146 UNC_M_CAS_COUNT.RD # 2279.6 MB/s ... idth_read
numactl --membind=0 --cpunodebind=1 ~/builds/master-test/bin/pg_ctl
-----------------------------------------------------------------------
S0 1 47,931,019 UNC_M_CAS_COUNT.WR # 153.4 MB/s ... idth_write
S0 1 20,002,933,380 ns duration_time ...
S0 1 1,007,386,994 UNC_M_CAS_COUNT.RD # 3223.2 MB/s ... idth_read
S0 1 20,002,927,341 ns duration_time ...
S1 1 10,310,201 UNC_M_CAS_COUNT.WR # 33.0 MB/s ... idth_write
S1 1 714,826,668 UNC_M_CAS_COUNT.RD # 2287.2 MB/s ... idth_read
so there is a little bit of a difference for some stats, but not much.
FWIW this is from
perf stat --per-socket -M memory_bandwidth_read,memory_bandwidth_write
-a -- sleep 20
while the agg query runs in a loop.
cheers
--
Tomas Vondra
| From | Date | Subject | |
|---|---|---|---|
| Next Message | Andres Freund | 2026-01-13 01:08:31 | Re: Adding basic NUMA awareness |
| Previous Message | Andres Freund | 2026-01-13 00:33:56 | Re: Buffer locking is special (hints, checksums, AIO writes) |