Re: sync_standbys_defined read/write race on startup

From: "Maksim(dot)Melnikov" <m(dot)melnikov(at)postgrespro(dot)ru>
To: Michael Paquier <michael(at)paquier(dot)xyz>
Cc: pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: sync_standbys_defined read/write race on startup
Date: 2025-03-28 07:57:58
Message-ID: 884d9875-d35d-44fb-993d-f9326e5282a8@postgrespro.ru
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-hackers


On 27.03.2025 01:13, Michael Paquier wrote:
> Tracking if the configuration has been properly loaded in
> WalSndCtlData makes sense here, but I am confused by the patch: you
> have defined SYNCSTANDBYSDEFINED but sync_standbys_defined never sets
> it. It may be simpler to use a separate boolean flag for this
> purpose. WalSndCtlData is a private structure holding the state of
> the WAL senders internally, so ABI does not stress me much here
> (flexible array at the end of the structure, as well..).

Thanks for your responses.
Yes, I agree that it looks more complicated then addition new bool field,
but there is place in SyncRepWaitForLSN method, where we check
WalSndCtlData->sync_standbys_defined out of acquire/release block.

void
SyncRepWaitForLSN(XLogRecPtr lsn, bool commit)
{
    int            mode;
...............................................
*    if (!SyncRepRequested() ||
        !((volatile WalSndCtlData *) WalSndCtl)->sync_standbys_defined)
        return;*
..........................

If we just add new bool field, we can't atomically check
sync_standbys_defined flag
and new bool field, only in lock, so we lose fast exit optimization. But
I agree
that in patch I lost SYNCSTANDBYSDEFINED setting, thanks for attentiveness,
so I've attached new patch with fix,also I've done bits setting more
evident.

Anyway, if you still thinks that my arguments not enough, please notify,
I will do patch with separate flag.

On 27.03.2025 01:13, Michael Paquier wrote:
> Also mentioned by Kirill downthread: let's add a regression test with
> an injection point that does a wait. This race condition is worth
> having a test for. You could define the point of where you have added
> your hardcoded sleep(), for example.
Sorry, but for me it seems that we can't use injection points toolset
here, because we are talking
about startup process, and we can't call injection_points_attach on
client backend before checkpointer.

Anyway, if you know how to do it, please share details, I will do.

Best regards,

Maksim Melnikov

Attachment Content-Type Size
0002-Fix-sync_standbys_defined-race-on-startup.patch text/x-patch 5.5 KB

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Peter Eisentraut 2025-03-28 08:13:54 Re: Thread-safe nl_langinfo() and localeconv()
Previous Message Kirill Reshke 2025-03-28 07:52:45 Re: duplicated comments on get_relation_constraints