| From: | Andres Freund <andres(at)anarazel(dot)de> |
|---|---|
| To: | Xuneng Zhou <xunengzhou(at)gmail(dot)com> |
| Cc: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Alexander Korotkov <aekorotkov(at)gmail(dot)com>, Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, Peter Eisentraut <peter(at)eisentraut(dot)org>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, Álvaro Herrera <alvherre(at)kurilemu(dot)de>, Chao Li <li(dot)evan(dot)chao(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Michael Paquier <michael(at)paquier(dot)xyz>, jian he <jian(dot)universality(at)gmail(dot)com>, Tomas Vondra <tomas(at)vondra(dot)me>, Yura Sokolov <y(dot)sokolov(at)postgrespro(dot)ru> |
| Subject: | Re: Implement waiting for wal lsn replay: reloaded |
| Date: | 2026-04-07 03:31:04 |
| Message-ID: | jzq5shdewncpxc35r3s2mcfsmo4bjovkza5mnqf5bdfumhfi3g@bglckf7dxmw5 |
| Views: | Whole Thread | Raw Message | Download mbox | Resend email |
| Thread: | |
| Lists: | pgsql-hackers |
Hi,
On 2026-04-06 23:07:45 -0400, Andres Freund wrote:
> But, leaving that aside, looking at this code I'm somewhat concerned - it
> seems to not worry at all about memory ordering?
>
>
> static void
> XLogWalRcvWrite(char *buf, Size nbytes, XLogRecPtr recptr, TimeLineID tli)
> ...
> /* Update shared-memory status */
> pg_atomic_write_u64(&WalRcv->writtenUpto, LogstreamResult.Write);
>
> /*
> * If we wrote an LSN that someone was waiting for, notify the waiters.
> */
> if (waitLSNState &&
> (LogstreamResult.Write >=
> pg_atomic_read_u64(&waitLSNState->minWaitedLSN[WAIT_LSN_TYPE_STANDBY_WRITE])))
> WaitLSNWakeup(WAIT_LSN_TYPE_STANDBY_WRITE, LogstreamResult.Write);
>
> There are no memory barriers here, so the CPU would be entirely free to not
> make the writtenUpto write visible to a waiter that's in the process of
> registering and is checking whether it needs to wait in WaitForLSN().
>
> And WaitForLSN()->GetCurrentLSNForWaitType()->GetWalRcvWriteRecPtr() also has
> no barriers. That MAYBE is ok, due addLSNWaiter() providing the barrier at
> loop entry and maybe kinda you can think that WaitLatch() will somehow also
> have barrier semantic. But if so, that would need to be very carefully
> documented. And it seems completely unnecessary here, it's hard to believe
> using a barrier (via pg_atomic_read_membarrier_u64() or such) would be a
> performance issue
And separately from the memory ordering, how can it make sense that there's
at least 5 copies of this
if (waitLSNState &&
(LogstreamResult.Flush >=
pg_atomic_read_u64(&waitLSNState->minWaitedLSN[WAIT_LSN_TYPE_STANDBY_FLUSH])))
WaitLSNWakeup(WAIT_LSN_TYPE_STANDBY_FLUSH, LogstreamResult.Flush);
around? That needs to be encapsulated so that if you have a bug, like the
memory ordering problem I describe above, it can be fixed once, not in
multiple places.
And why do these callers even have that pre-check? Seems WaitLSNWakeup()
does so itself?
/*
* Fast path check. Skip if currentLSN is InvalidXLogRecPtr, which means
* "wake all waiters" (e.g., during promotion when recovery ends).
*/
if (XLogRecPtrIsValid(currentLSN) &&
pg_atomic_read_u64(&waitLSNState->minWaitedLSN[i]) > currentLSN)
return;
And why is the code checking if waitLSNState is non-NULL?
Greetings,
Andres Freund
| From | Date | Subject | |
|---|---|---|---|
| Next Message | Amit Kapila | 2026-04-07 03:32:54 | Re: pgsql: Reduce log level of some logical decoding messages from LOG to D |
| Previous Message | Amit Kapila | 2026-04-07 03:18:02 | Re: pgsql: Reduce log level of some logical decoding messages from LOG to D |