RE: Fix stats reporting delays in logical parallel apply worker

From: "Zhijie Hou (Fujitsu)" <houzj(dot)fnst(at)fujitsu(dot)com>
To: Chao Li <li(dot)evan(dot)chao(at)gmail(dot)com>
Cc: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Subject: RE: Fix stats reporting delays in logical parallel apply worker
Date: 2026-04-17 09:20:15
Message-ID: TYRPR01MB141955449A42C05BAC4F35BF594202@TYRPR01MB14195.jpnprd01.prod.outlook.com
Lists: pgsql-hackers

On Friday, April 17, 2026 3:41 PM Chao Li <li(dot)evan(dot)chao(at)gmail(dot)com> wrote:
>
> > On Apr 17, 2026, at 11:35, Zhijie Hou (Fujitsu) <houzj(dot)fnst(at)fujitsu(dot)com> wrote:
> >
> > On Friday, April 17, 2026 11:01 AM Zhijie Hou (Fujitsu) <houzj(dot)fnst(at)fujitsu(dot)com> wrote:
> >> Hi,
> >>
> >> When implementing another feature, I noticed that parallel apply workers
> >> currently do not report statistics while idle in their main loop. This can
> >> cause stats from the last processed transaction to be arbitrarily delayed,
> >> especially when there are long gaps between streamed transactions.
> >>
> >> The issue is demonstrated in 0002, where a TAP test fails when attempting
> >> to collect stats from a parallel apply worker that has no subsequent
> >> transaction to trigger a stats report.
> >>
> >> 0001 fixes this issue by forcing a stats report when the worker is idle in
> >> the main loop, matching the behavior already present in
> >> LogicalRepApplyLoop() for regular logical apply workers.
> >
> > Regarding 0002, I realized that the streaming option is now set to
> > 'parallel' by default, so we can avoid adjusting the option again. The test
> > needs to be adjusted to increase the worker limit so that a parallel worker
> > can start. Here are the updated patches.
> >
> > Best Regards,
> > Hou zj
> > <v2-0001-Fix-stats-reporting-delays-in-parallel-apply-work.patch>
> > <v2-0002-Test-the-stats-report-in-parallel-apply-worker.patch>
>
> I think WaitLatch will never return WL_LATCH_SET and WL_TIMEOUT
> together, so we can do "else if ((rc & WL_TIMEOUT)
> && !IsTransactionState())", so that upon WL_LATCH_SET, it skips the
> WL_TIMEOUT check, which could be slightly more efficient.

I'm not sure we should assume that WaitLatch sets only one flag at a time.
Even if that assumption holds in this specific case, chaining bit-flag checks
with "else if" looks a bit odd, and AFAICS we don't use this style elsewhere
in the code. Currently, the users of WL_TIMEOUT (in basebackup_throttle.c,
walreceiver.c, and worker.c) all test each flag with a separate "if".
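
To illustrate the difference, here is a minimal standalone sketch, not the
patch code. The DEMO_* flag macros, the DemoActions struct, and both handler
functions are hypothetical stand-ins written for this example, and the bit
values are illustrative only. It just shows how the two styles diverge if a
wait result ever carried both bits.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the wait-event bits; values are illustrative. */
#define DEMO_WL_LATCH_SET (1 << 0)
#define DEMO_WL_TIMEOUT   (1 << 3)

/* Records which branches a handler took for a given wait result. */
typedef struct DemoActions
{
    bool latch_reset;      /* the "latch was set" branch ran */
    bool stats_reported;   /* the "timeout while idle" branch ran */
} DemoActions;

/* "if ... if" style: each flag is tested independently. */
DemoActions
handle_independent(int rc, bool in_transaction)
{
    DemoActions a = {false, false};

    if (rc & DEMO_WL_LATCH_SET)
        a.latch_reset = true;

    if ((rc & DEMO_WL_TIMEOUT) && !in_transaction)
        a.stats_reported = true;

    return a;
}

/* "if ... else if" style: the timeout test is skipped when the latch bit is set. */
DemoActions
handle_else_if(int rc, bool in_transaction)
{
    DemoActions a = {false, false};

    if (rc & DEMO_WL_LATCH_SET)
        a.latch_reset = true;
    else if ((rc & DEMO_WL_TIMEOUT) && !in_transaction)
        a.stats_reported = true;

    return a;
}
```

With only one bit set, the two handlers behave identically. They differ only
when both bits are present: the independent form still takes the timeout
branch, while the "else if" form silently skips it.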

Best Regards,
Hou zj
