Re: Add CHECK_FOR_INTERRUPTS in polling loop code path in XactLockTableWait

From: Kevin K Biju <kevinkbiju(at)gmail(dot)com>
To: Fujii Masao <masao(dot)fujii(at)oss(dot)nttdata(dot)com>
Cc: pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: Add CHECK_FOR_INTERRUPTS in polling loop code path in XactLockTableWait
Date: 2025-05-28 20:40:50
Message-ID: CAM45KeEOKkfAHdo2i56S6b7mGPLpEaSzpBhGWvBAeDcOGQef_A@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi Fujii,

The WaitEvent sounds good to me, I will submit a separate patch for that
when I find time.

Kevin

On Wed, May 28, 2025 at 5:06 PM Fujii Masao <masao(dot)fujii(at)oss(dot)nttdata(dot)com>
wrote:

>
>
> On 2025/05/27 4:43, Kevin K Biju wrote:
> > Hi Fujii,
> >
> > Thanks for the review.
>
> So unless there are any objections, I'm planning to commit
> the patch with the following commit message.
>
> ---------------
> Make XactLockTableWait() and ConditionalXactLockTableWait() interruptable
> more.
>
> Previously, XactLockTableWait() and ConditionalXactLockTableWait() could
> enter
> a non-interruptible loop when they successfully acquired a lock on a
> transaction
> but the transaction still appeared to be running. Since this loop continued
> until the transaction completed, it could result in long, uninterruptible
> waits.
>
> Although this scenario is generally unlikely since a transaction lock is
> basically acquired only when the transaction is not running, it can occur
> in a hot standby. In such cases, the transaction may still appear active
> due to
> the KnownAssignedXids list, even while no lock on the transaction exists.
> For example, this situation can happen when creating a logical replication
> slot
> on a standby.
>
> The cause of the non-interruptible loop was the absence of
> CHECK_FOR_INTERRUPTS()
> within it. This commit adds CHECK_FOR_INTERRUPTS() to the loop in both
> functions,
> ensuring they can be interrupted safely.
>
> Back-patch to all supported branches.
>
> Author: Kevin K Biju <kevinkbiju(at)gmail(dot)com>
> Reviewed-by: Fujii Masao <masao(dot)fujii(at)gmail(dot)com>
> Discussion:
> https://postgr.es/m/CAM45KeELdjhS-rGuvN=ZLJ_asvZACucZ9LZWVzH7bGcD12DDwg@mail.gmail.com
> Backpatch-through: 13
> ---------------
>
>
> > > Just an idea: how about calling pgstat_report_wait_start() and _end()
> around pg_usleep(), similar to what WaitExceedsMaxStandbyDelay() does?
> >
> > I was thinking about this as well, but the question is what do we report
> the wait event here as? The one that makes sense to me is PG_WAIT_LOCK, but
> that wouldn't align with what pg_locks would display since there is no
> actual lock grabbed on the standby.
>
> I couldn't find an existing wait event that fits this case, so how about
> adding a new one, like IPC/XactDone, to indicate "Waiting for the
> transaction
> to commit or abort"?
>
>
> > > Also, would waiting only 1ms per loop cycle be too aggressive?
> >
> > I agree that it's aggressive for this case. But
> XactLockTableWait/ConditionalXactLockTableWait seem to be used in other
> places including in heapam so I'm hesitant on changing this behaviour for
> all of them.
> I agree that we need more time to consider whether this change is safe.
>
> Regards,
>
> --
> Fujii Masao
> NTT DATA Japan Corporation
>
>

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2025-05-28 20:56:32 Re: Fixing memory leaks in postgres_fdw
Previous Message Joe Conway 2025-05-28 18:54:29 Re: PG 18 release notes draft committed