Re: Add CHECK_FOR_INTERRUPTS in polling loop code path in XactLockTableWait

From: Fujii Masao <masao(dot)fujii(at)oss(dot)nttdata(dot)com>
To: Kevin K Biju <kevinkbiju(at)gmail(dot)com>
Cc: pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: Add CHECK_FOR_INTERRUPTS in polling loop code path in XactLockTableWait
Date: 2025-05-28 11:36:32
Message-ID: 6d9cb2f7-6273-4be6-959e-e315917550ac@oss.nttdata.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 2025/05/27 4:43, Kevin K Biju wrote:
> Hi Fujii,
>
> Thanks for the review.

So unless there are any objections, I'm planning to commit
the patch with the following commit message.

---------------
Make XactLockTableWait() and ConditionalXactLockTableWait() interruptable more.

Previously, XactLockTableWait() and ConditionalXactLockTableWait() could enter
a non-interruptible loop when they successfully acquired a lock on a transaction
but the transaction still appeared to be running. Since this loop continued
until the transaction completed, it could result in long, uninterruptible waits.

Although this scenario is generally unlikely since a transaction lock is
basically acquired only when the transaction is not running, it can occur
in a hot standby. In such cases, the transaction may still appear active due to
the KnownAssignedXids list, even while no lock on the transaction exists.
For example, this situation can happen when creating a logical replication slot
on a standby.

The cause of the non-interruptible loop was the absence of CHECK_FOR_INTERRUPTS()
within it. This commit adds CHECK_FOR_INTERRUPTS() to the loop in both functions,
ensuring they can be interrupted safely.

Back-patch to all supported branches.

Author: Kevin K Biju <kevinkbiju(at)gmail(dot)com>
Reviewed-by: Fujii Masao <masao(dot)fujii(at)gmail(dot)com>
Discussion: https://postgr.es/m/CAM45KeELdjhS-rGuvN=ZLJ_asvZACucZ9LZWVzH7bGcD12DDwg@mail.gmail.com
Backpatch-through: 13
---------------

> > Just an idea: how about calling pgstat_report_wait_start() and _end() around pg_usleep(), similar to what WaitExceedsMaxStandbyDelay() does?
>
> I was thinking about this as well, but the question is what do we report the wait event here as? The one that makes sense to me is PG_WAIT_LOCK, but that wouldn't align with what pg_locks would display since there is no actual lock grabbed on the standby.

I couldn't find an existing wait event that fits this case, so how about
adding a new one, like IPC/XactDone, to indicate "Waiting for the transaction
to commit or abort"?

> > Also, would waiting only 1ms per loop cycle be too aggressive?
>
> I agree that it's aggressive for this case. But XactLockTableWait/ConditionalXactLockTableWait seem to be used in other places including in heapam so I'm hesitant on changing this behaviour for all of them.
I agree that we need more time to consider whether this change is safe.

Regards,

--
Fujii Masao
NTT DATA Japan Corporation

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Álvaro Herrera 2025-05-28 11:59:29 Re: ALTER TABLE ALTER CONSTRAINT misleading error message
Previous Message Tender Wang 2025-05-28 11:30:09 Re: Standardize the definition of the subtype field of AlterDomainStmt