From: | Marco Boeringa <marco(at)boeringa(dot)demon(dot)nl> |
---|---|
To: | Andres Freund <andres(at)anarazel(dot)de> |
Cc: | pgsql-bugs(at)lists(dot)postgresql(dot)org, Thom Brown <thom(at)linux(dot)com> |
Subject: | Re: Potential "AIO / io workers" inter-worker locking issue in PG18? |
Date: | 2025-10-07 19:34:00 |
Message-ID: | 1c76a5f4-1b9f-46b2-b49b-2ff2b285fb99@boeringa.demon.nl |
Views: | Whole Thread | Raw Message | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-bugs |
Hi Andres,
I have now been able to successfully run 'perf' after a VM restart and
booting with the linux 6.8 kernel.
I have copied in two results for 'perf top' and a recording of one of
the about 10 stuck processes by supplying a specific pid of the postgres
backend.
Let me know if this is of any use. Of course, the recordings themselves
are much bigger, I just copied the top part as displayed by the 'perf'
report option. If you need the actual underlying files, let me know.
Also, if you have specific suggestions for 'perf' command options you
would like to see run, please post them in your response and I will try
if I can run them.
Andres: I posted separately to you I had a suspicion a recent linux
kernel update (6.8 --> 6.14) might be involved that coincided with the
PG18/PostGIS 3.6.0 upgrade when I initially had multiple successful runs
of my tool without the temporary stalling / delay in returning of
postgres after booting to the older 6.8 kernel. This has now proven
wrong: these results are of the 6.8 kernel runs, so the kernel is not
involved, and I also see the issue with the older kernel.
*** sudo perf top: ***
Samples: 1M of event 'cpu-clock:ppp', 4000 Hz, Event count (approx.):
346785156250 lost: 0/0 drop: 0/0
Overhead Shared Object Symbol
17,33% postgres [.] 0x0000000000584216
12,21% postgres [.] 0x000000000058737a
11,80% [kernel] [k] pv_native_safe_halt
10,24% postgres [.] ReleaseAndReadBuffer
4,92% postgres [.] heap_hot_search_buffer
3,63% postgres [.]
hash_search_with_hash_value
2,36% postgres [.] 0x00000000005841f3
1,85% postgres [.] 0x0000000000554edd
1,61% postgres [.] 0x0000000000554e1f
1,40% postgres [.] 0x00000000005567f9
1,32% postgres [.] StartReadBuffer
1,24% postgres [.] heap_page_prune_opt
1,23% postgres [.] LWLockAcquire
1,14% postgres [.]
HeapTupleSatisfiesVisibility
1,10% postgres [.] 0x00000000003977dd
0,92% postgres [.] index_getnext_tid
0,87% postgres [.] index_fetch_heap
0,83% postgres [.] ExecScan
0,82% postgres [.] ReleaseBuffer
0,81% postgres [.] LWLockRelease
0,71% postgres [.] XidInMVCCSnapshot
0,66% postgres [.] _bt_checkkeys
0,66% postgres [.] _bt_next
0,55% libxorgxrdp.so [.] crc_process_data
0,51% postgres [.] btgettuple
0,41% postgres [.] 0x00000000005567c6
0,40% postgres [.] 0x00000000003975d0
0,35% postgres [.]
ExecStoreBufferHeapTuple
0,34% postgres [.] hash_bytes
0,33% postgres [.] MemoryContextReset
0,33% postgres [.] 0x00000000003975d3
0,32% postgres [.] ExecEvalSysVar
0,28% postgres [.] 0x00000000003975e3
0,27% postgres [.] ResourceOwnerForget
0,27% postgres [.] GlobalVisTestFor
0,26% postgres [.] 0x00000000003975b6
0,26% postgres [.] HeapTupleIsSurelyDead
0,23% postgres [.] 0x00000000003975c3
0,23% postgres [.] 0x00000000001ff80a
0,22% postgres [.] PredicateLockTID
0,21% postgres [.] slot_getsomeattrs_int
0,20% postgres [.] 0x000000000058421d
0,19% postgres [.] ReadBufferExtended
0,16% libc.so.6 [.] __memcmp_sse2
0,15% postgres [.] 0x0000000000382e9e
*** sudo perf -p <PID of one stuck postgres backend> -g -d 10 ***
*** sudo perf report -g ***
Samples: 40K of event 'task-clock:ppp', Event count (approx.): 10018000000
Children Self Command Shared Object Symbol
+ 100,00% 0,00% postgres postgres [.] _start
+ 100,00% 0,00% postgres libc.so.6 [.]
__libc_start_main@@GLIBC_2.34
+ 100,00% 0,00% postgres libc.so.6 [.]
__libc_start_call_main
+ 100,00% 0,00% postgres postgres [.] main
+ 100,00% 0,00% postgres postgres [.] PostmasterMain
+ 100,00% 0,00% postgres postgres [.] 0x00005f35711b913e
+ 100,00% 0,00% postgres postgres [.]
postmaster_child_launch
+ 100,00% 0,00% postgres postgres [.] 0x00005f3570fb9dbf
+ 100,00% 0,00% postgres postgres [.] PostgresMain
+ 100,00% 0,00% postgres postgres [.] 0x00005f3570fb6cb9
+ 100,00% 0,00% postgres postgres [.] PortalRun
+ 100,00% 0,00% postgres postgres [.] 0x00005f3570fbeca3
+ 100,00% 0,00% postgres postgres [.] 0x00005f3570fbcf85
+ 100,00% 0,00% postgres postgres [.] standard_ExecutorRun
+ 100,00% 0,00% postgres postgres [.] 0x00005f3570de682b
+ 100,00% 0,00% postgres postgres [.] 0x00005f3570de966d
+ 97,51% 0,00% postgres postgres [.] 0x00005f3570de959d
+ 97,17% 0,95% postgres postgres [.] ExecScan
+ 82,75% 0,00% postgres postgres [.] 0x00005f3570dda135
+ 82,35% 1,04% postgres postgres [.] index_fetch_heap
+ 28,80% 0,00% postgres postgres [.] 0x00005f3570bf3f87
+ 23,67% 0,00% postgres postgres [.] 0x00005f3570bf3ee8
+ 22,19% 0,00% postgres postgres [.] 0x00005f3570fa1216
+ 22,18% 22,18% postgres postgres [.] 0x0000000000584216
+ 17,06% 0,89% postgres postgres [.] LWLockRelease
+ 16,18% 0,00% postgres postgres [.] 0x00005f3570bf3f4f
+ 15,59% 0,00% postgres postgres [.] 0x00005f3570fa437a
+ 15,57% 15,57% postgres postgres [.] 0x000000000058737a
+ 14,90% 0,22% postgres postgres [.] ReadBufferExtended
+ 14,18% 13,05% postgres postgres [.] ReleaseAndReadBuffer
+ 14,00% 1,82% postgres postgres [.] StartReadBuffer
+ 6,59% 0,00% postgres postgres [.] 0x00005f3570bf3f18
+ 5,97% 3,78% postgres postgres [.]
heap_hot_search_buffer
+ 5,23% 0,00% postgres postgres [.] 0x00005f3570da0613
+ 5,20% 0,00% postgres postgres [.] 0x00005f3570dda121
+ 5,19% 0,22% postgres postgres [.] slot_getsomeattrs_int
+ 5,18% 0,68% postgres postgres [.] index_getnext_tid
+ 4,52% 0,65% postgres postgres [.] btgettuple
+ 3,92% 0,99% postgres postgres [.] _bt_next
+ 3,28% 3,28% postgres postgres [.]
hash_search_with_hash_value
+ 2,96% 0,38% postgres postgres [.]
ExecStoreBufferHeapTuple
+ 2,93% 0,00% postgres postgres [.] 0x00005f3570bf3fe0
From | Date | Subject | |
---|---|---|---|
Next Message | Marco Boeringa | 2025-10-07 19:49:14 | Re: Potential "AIO / io workers" inter-worker locking issue in PG18? |
Previous Message | Masahiko Sawada | 2025-10-07 17:25:29 | Re: TRAP: failed Assert("outerPlan != NULL") in postgres_fdw.c |