Re: Fix stats reporting delays in logical parallel apply worker

From: Chao Li <li(dot)evan(dot)chao(at)gmail(dot)com>
To: "Zhijie Hou (Fujitsu)" <houzj(dot)fnst(at)fujitsu(dot)com>
Cc: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Subject: Re: Fix stats reporting delays in logical parallel apply worker
Date: 2026-04-17 09:30:14
Message-ID: 56F4B26D-F9B6-415D-B51A-AC96EA771006@gmail.com
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

> On Apr 17, 2026, at 17:20, Zhijie Hou (Fujitsu) <houzj(dot)fnst(at)fujitsu(dot)com> wrote:
>
> On Friday, April 17, 2026 3:41 PM Chao Li <li(dot)evan(dot)chao(at)gmail(dot)com> wrote:
>>
>>> On Apr 17, 2026, at 11:35, Zhijie Hou (Fujitsu) <houzj(dot)fnst(at)fujitsu(dot)com>
>> wrote:
>>>
>>> On Friday, April 17, 2026 11:01 AM Zhijie Hou (Fujitsu)
>> <houzj(dot)fnst(at)fujitsu(dot)com> wrote:
>>>> Hi,
>>>>
>>>> When implementing another feature, I noticed that parallel apply workers
>>>> currently do not report statistics while idle in their main loop. This can
>> cause
>>>> stats from the last processed transaction to be arbitrarily delayed,
>> especially
>>>> when there are long gaps between streamed transactions.
>>>>
>>>> The issue is demonstrated in 0002, where a TAP test fails when attempting
>> to
>>>> collect stats from a parallel apply worker that has no subsequent
>> transaction
>>>> to
>>>> trigger a stats report.
>>>>
>>>> 0001 fixes this issue by forcing a stats report when the worker is idle in the
>>>> main loop, matching the behavior already present in
>> LogicalRepApplyLoop()
>>>> for
>>>> regular logical apply workers.
>>>
>>> Regarding 0002, I realized that the streaming option is now set to 'parallel'
>> by
>>> default so can avoid adjusting the option again. The test needs to be
>> adjusted
>>> to increase the worker limit so that a parallel worker can start. Here are the
>>> updated patches.
>>>
>>> Best Regards,
>>> Hou zj
>>> <v2-0001-Fix-stats-reporting-delays-in-parallel-apply-work.patch><v2-
>> 0002-Test-the-stats-report-in-parallel-apply-worker.patch>
>>
>> I think WaitLatch will never return WL_LATCH_SET and WL_TIMEOUT
>> together, so we can do “else if (rc & WL_TIMEOUT)
>> && !IsTransactionState())”, so that upon WL_LATCH_SET, it skips the
>> WL_TIMEOUT check, which could be slightly more efficient.
>
> I'm not sure we should assume that WaitLatch will set only one flag at a time.
> even if that assumption holds for this specific case, handling bit flags this way looks a bit odd.
> AFAICS, we don't use this style elsewhere in the code.
> Currently, users of WL_TIMEOUT (in basebackup_throttle.c, walreceiver.c, worker.c)
> all use if ... if logic.
>
> Best Regards,
> Hou zj

WL_TIMEOUT is not a real event. If we look at the code of WaitLatch:
```
if (WaitEventSetWait(LatchWaitSet,
(wakeEvents & WL_TIMEOUT) ? timeout : -1,
&event, 1,
wait_event_info) == 0)
return WL_TIMEOUT;
else
return event.events;
```
WL_TIMEOUT won’t be union with other events at all.

Anyway, that’s not a big concern.

Best regards,
--
Chao Li (Evan)
HighGo Software Co., Ltd.
https://www.highgo.com/

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Tatsuo Ishii 2026-04-17 10:13:03 Re: Row pattern recognition
Previous Message Zhijie Hou (Fujitsu) 2026-04-17 09:20:15 RE: Fix stats reporting delays in logical parallel apply worker