From: | Anthony Hsu <erwaman(at)gmail(dot)com> |
---|---|
To: | pgsql-hackers(at)lists(dot)postgresql(dot)org |
Subject: | Set 1s WaitLatch timeout if standby limit has expired in ResolveRecoveryConflictWithBufferPin |
Date: | 2025-07-06 18:55:15 |
Message-ID: | CALQc50gi-Kw9m1r6hytf12473-fCECy=q9JtKS4ANeJFEyCBTw@mail.gmail.com |
Lists: | pgsql-hackers |
Hi,
I think there is a race scenario where a backend holding a conflicting
buffer pin isn't promptly canceled even when the standby limit has expired:
1. Suppose there is a buffer pin conflict and the standby limit has already
expired.
2. The startup process enters ResolveRecoveryConflictWithBufferPin and
broadcasts PROCSIG_RECOVERY_CONFLICT_BUFFERPIN here [A] but does not set
any timeouts.
3. The startup process waits to be signaled by UnpinBuffer() here [B].
4. Some non-conflicting backend receives the buffer pin signal sent in (2),
checks and sees it is not blocking recovery, and *then* acquires a
conflicting buffer pin.
5. The original conflicting backend then receives the buffer pin signal
sent in (2) and cancels itself, calling UnpinBuffer(). But the pin count
will still be > 1 (due to (4) + the pin the startup process holds), so the
startup process will not be woken up.
In this scenario, the startup process might not be woken up for an
arbitrarily long time, and the new conflicting backend (step (4) above)
won't be sent another PROCSIG_RECOVERY_CONFLICT_BUFFERPIN signal telling
it to cancel itself.
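To make the sequence concrete, here is a toy model in C of the pin counting
in steps (2) to (5). It is only a sketch, not PostgreSQL code: ToyBuffer,
toy_pin, toy_unpin and replay_race are made-up names, and the model assumes
the waiter is woken only by the unpin that leaves exactly one pin (the
waiter's own).

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for a shared buffer's pin state (hypothetical, not BufferDesc). */
typedef struct ToyBuffer
{
    int  refcount;    /* total pins, including the waiting process's own pin */
    bool waiter_set;  /* a process is waiting for refcount to drop to 1 */
} ToyBuffer;

static void
toy_pin(ToyBuffer *buf)
{
    buf->refcount++;
}

/* Drop one pin; returns true if this unpin would wake the waiter. */
static bool
toy_unpin(ToyBuffer *buf)
{
    assert(buf->refcount > 0);
    buf->refcount--;
    return buf->waiter_set && buf->refcount == 1;
}

/* Replay steps (2)-(5); returns true if the startup process was woken. */
static bool
replay_race(void)
{
    /* Steps (2)/(3): startup holds one pin, backend A holds the other. */
    ToyBuffer buf = { .refcount = 2, .waiter_set = true };

    /* Step (4): backend B has already handled the signal, then pins. */
    toy_pin(&buf);

    /* Step (5): backend A cancels and unpins; refcount goes 3 -> 2. */
    return toy_unpin(&buf);
}
```

In this model replay_race() returns false: backend A's unpin leaves two pins
(startup and backend B), so nothing wakes the startup process.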
To handle this scenario, I think we should set a timeout when doing
WaitLatch if the standby limit has already expired. This allows the startup
process to wake up within a reasonable time, recheck, and send
PROCSIG_RECOVERY_CONFLICT_BUFFERPIN again to any new conflicting backends.
I have attached a small patch with this proposed fix.
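As a rough illustration of why a periodic wakeup helps (this is a toy model,
not the attached patch), the C sketch below counts how many wait cycles the
startup process needs before only its own pin remains. toy_resolve and its
parameters are hypothetical; late_pinners[i] is the assumed number of
backends that acquire a conflicting pin after handling the i-th broadcast.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Toy model: each cycle the startup process broadcasts the conflict signal,
 * every backend holding a pin at that moment cancels and unpins, and
 * late_pinners[cycle] backends pin afterwards.
 *
 * Returns the number of wait cycles until only startup's pin remains, or -1
 * if the model gives no guaranteed wakeup.
 */
static int
toy_resolve(const int *late_pinners, int ncycles, bool use_timeout)
{
    for (int cycle = 0; cycle < ncycles; cycle++)
    {
        int holders = late_pinners[cycle];

        assert(holders >= 0);

        /* The last unpin leaves only startup's pin, which wakes startup. */
        if (holders == 0)
            return cycle + 1;

        /*
         * Without a timeout there is no second broadcast, so nothing tells
         * the late pinners to cancel themselves.
         */
        if (!use_timeout)
            return -1;
    }
    return -1;
}
```

With one late pinner in the first cycle and none in the second, the model
returns -1 without a timeout and 2 with one: the second broadcast, sent
after the timed wait expires, is what cancels the late pinner.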
Thanks,
Anthony
[A]
https://github.com/postgres/postgres/blob/21c9756db6458f859e6579a6754c78154321cb39/src/backend/storage/ipc/standby.c#L806
[B]
https://github.com/postgres/postgres/blob/21c9756db6458f859e6579a6754c78154321cb39/src/backend/storage/ipc/standby.c#L843
Attachment | Content-Type | Size |
---|---|---|
v1-0001-Set-1s-WaitLatch-timeout-if-standby-limit-has-exp.patch | application/octet-stream | 3.9 KB |