| From: | Chao Li <li(dot)evan(dot)chao(at)gmail(dot)com> |
|---|---|
| To: | "Zhijie Hou (Fujitsu)" <houzj(dot)fnst(at)fujitsu(dot)com> |
| Cc: | PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> |
| Subject: | Re: Fix stats reporting delays in logical parallel apply worker |
| Date: | 2026-04-17 09:30:14 |
| Message-ID: | 56F4B26D-F9B6-415D-B51A-AC96EA771006@gmail.com |
| Views: | Whole Thread | Raw Message | Download mbox | Resend email |
| Thread: | |
| Lists: | pgsql-hackers |
> On Apr 17, 2026, at 17:20, Zhijie Hou (Fujitsu) <houzj(dot)fnst(at)fujitsu(dot)com> wrote:
>
> On Friday, April 17, 2026 3:41 PM Chao Li <li(dot)evan(dot)chao(at)gmail(dot)com> wrote:
>>
>>> On Apr 17, 2026, at 11:35, Zhijie Hou (Fujitsu) <houzj(dot)fnst(at)fujitsu(dot)com>
>> wrote:
>>>
>>> On Friday, April 17, 2026 11:01 AM Zhijie Hou (Fujitsu)
>> <houzj(dot)fnst(at)fujitsu(dot)com> wrote:
>>>> Hi,
>>>>
>>>> When implementing another feature, I noticed that parallel apply workers
>>>> currently do not report statistics while idle in their main loop. This can
>> cause
>>>> stats from the last processed transaction to be arbitrarily delayed,
>> especially
>>>> when there are long gaps between streamed transactions.
>>>>
>>>> The issue is demonstrated in 0002, where a TAP test fails when attempting
>> to
>>>> collect stats from a parallel apply worker that has no subsequent
>> transaction
>>>> to
>>>> trigger a stats report.
>>>>
>>>> 0001 fixes this issue by forcing a stats report when the worker is idle in the
>>>> main loop, matching the behavior already present in
>> LogicalRepApplyLoop()
>>>> for
>>>> regular logical apply workers.
>>>
>>> Regarding 0002, I realized that the streaming option is now set to 'parallel'
>> by
>>> default so can avoid adjusting the option again. The test needs to be
>> adjusted
>>> to increase the worker limit so that a parallel worker can start. Here are the
>>> updated patches.
>>>
>>> Best Regards,
>>> Hou zj
>>> <v2-0001-Fix-stats-reporting-delays-in-parallel-apply-work.patch><v2-
>> 0002-Test-the-stats-report-in-parallel-apply-worker.patch>
>>
>> I think WaitLatch will never return WL_LATCH_SET and WL_TIMEOUT
>> together, so we can do “else if (rc & WL_TIMEOUT)
>> && !IsTransactionState())”, so that upon WL_LATCH_SET, it skips the
>> WL_TIMEOUT check, which could be slightly more efficient.
>
> I'm not sure we should assume that WaitLatch will set only one flag at a time.
> even if that assumption holds for this specific case, handling bit flags this way looks a bit odd.
> AFAICS, we don't use this style elsewhere in the code.
> Currently, users of WL_TIMEOUT (in basebackup_throttle.c, walreceiver.c, worker.c)
> all use if ... if logic.
>
> Best Regards,
> Hou zj
WL_TIMEOUT is not a real event. If we look at the code of WaitLatch:
```
if (WaitEventSetWait(LatchWaitSet,
(wakeEvents & WL_TIMEOUT) ? timeout : -1,
&event, 1,
wait_event_info) == 0)
return WL_TIMEOUT;
else
return event.events;
```
WL_TIMEOUT won’t be union with other events at all.
Anyway, that’s not a big concern.
Best regards,
--
Chao Li (Evan)
HighGo Software Co., Ltd.
https://www.highgo.com/
| From | Date | Subject | |
|---|---|---|---|
| Next Message | Tatsuo Ishii | 2026-04-17 10:13:03 | Re: Row pattern recognition |
| Previous Message | Zhijie Hou (Fujitsu) | 2026-04-17 09:20:15 | RE: Fix stats reporting delays in logical parallel apply worker |