| From: | Vitaly Davydov <v(dot)davydov(at)postgrespro(dot)ru> |
|---|---|
| To: | pgsql-hackers(at)lists(dot)postgresql(dot)org |
| Subject: | Re: Deadlock detector fails to activate on a hot standby replica |
| Date: | 2026-01-23 11:51:49 |
| Message-ID: | b178ea8d-9ed9-48b3-b4f7-5cfc3ff6ee44@postgrespro.ru |
| Lists: | pgsql-hackers |
Dear Hackers,
I would like to propose a patch that fixes the problem, which is rooted in the
possibility of spurious SIGALRM signals arriving while we wait for timeouts.
The idea of the patch is to ignore spurious SIGALRM signals and to continue waiting
for the expected timeouts or for the conflicting backend to unpin the buffer. This
patch is not the final version: I plan to add a TAP test for this case.
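To illustrate the idea outside of PostgreSQL, here is a minimal POSIX sketch. It is not the code from the attached patch, and every name in it (wait_ignoring_early_alarms, on_sigalrm, monotonic_now) is hypothetical. An interval timer stands in for SIGALRM signals that arrive before the expected timeout; the loop rechecks the deadline after each wakeup and goes back to waiting if the deadline has not been reached.

```c
#ifndef _XOPEN_SOURCE
#define _XOPEN_SOURCE 700
#endif
#include <assert.h>
#include <signal.h>
#include <stddef.h>
#include <sys/time.h>
#include <time.h>

/* Counts delivered SIGALRM signals (hypothetical helper state). */
static volatile sig_atomic_t alarms_seen = 0;

static void on_sigalrm(int signo)
{
    (void) signo;
    alarms_seen++;
}

static double monotonic_now(void)
{
    struct timespec ts;

    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (double) ts.tv_sec + (double) ts.tv_nsec / 1e9;
}

/*
 * Wait until timeout_sec has elapsed while SIGALRM fires every tick_usec
 * microseconds.  A wakeup that happens before the deadline is treated as
 * spurious: it is counted and the wait continues.  Returns the number of
 * ignored wakeups, or -1 if the signal handler or timer cannot be set up.
 */
int wait_ignoring_early_alarms(double timeout_sec, long tick_usec)
{
    struct sigaction sa, old_sa;
    sigset_t block, old_mask, wait_mask;
    struct itimerval timer, stop;
    double deadline;
    int early = 0;

    assert(tick_usec > 0 && tick_usec < 1000000);

    sa.sa_handler = on_sigalrm;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;
    if (sigaction(SIGALRM, &sa, &old_sa) != 0)
        return -1;

    /* Block SIGALRM so it can only be delivered inside sigsuspend(). */
    sigemptyset(&block);
    sigaddset(&block, SIGALRM);
    sigprocmask(SIG_BLOCK, &block, &old_mask);
    wait_mask = old_mask;
    sigdelset(&wait_mask, SIGALRM);

    timer.it_interval.tv_sec = 0;
    timer.it_interval.tv_usec = tick_usec;
    timer.it_value = timer.it_interval;
    stop.it_interval.tv_sec = 0;
    stop.it_interval.tv_usec = 0;
    stop.it_value = stop.it_interval;

    deadline = monotonic_now() + timeout_sec;
    if (setitimer(ITIMER_REAL, &timer, NULL) != 0)
    {
        sigprocmask(SIG_SETMASK, &old_mask, NULL);
        sigaction(SIGALRM, &old_sa, NULL);
        return -1;
    }

    while (monotonic_now() < deadline)
    {
        sigsuspend(&wait_mask);     /* returns after a handler has run */
        if (monotonic_now() < deadline)
            early++;                /* too early: ignore and keep waiting */
    }

    /* Stop the timer, let any pending signal reach our handler, then restore. */
    setitimer(ITIMER_REAL, &stop, NULL);
    sigprocmask(SIG_SETMASK, &old_mask, NULL);
    sigaction(SIGALRM, &old_sa, NULL);
    return early;
}
```

For example, wait_ignoring_early_alarms(0.1, 20000) waits about 100 ms while being woken every 20 ms, and reports the early wakeups it ignored. The point of the sketch is only the shape of the loop: a wakeup by itself is not taken as proof that the awaited timeout has expired.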
I am not sure it is right to put calls to some page-buffer-specific functions into
ResolveRecoveryConflictWithBufferPin (standby.c), but otherwise we would have to
make more changes in LockBufferForCleanup and ResolveRecoveryConflictWithBufferPin.
I also think we should add a description of this problem to timeout.c,
because the optimization of setitimer calls implemented there has some
non-obvious consequences. The optimization seems to have been introduced in commit
09cf1d52267644cdbdb734294012cf1228745aaa.
With best regards,
Vitaly
| Attachment | Content-Type | Size |
|---|---|---|
| v1-0001-Fix-deadlock-detector-activation-in-a-recovery-confl.patch | text/x-patch | 6.5 KB |