Re: Allow async standbys wait for sync replication (was: Disallow quorum uncommitted (with synchronous standbys) txns in logical replication subscribers)

From: "Hsu, John" <hsuchen(at)amazon(dot)com>
To: Nathan Bossart <nathandbossart(at)gmail(dot)com>, Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>
Cc: SATYANARAYANA NARLAPURAM <satyanarlapuram(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Allow async standbys wait for sync replication (was: Disallow quorum uncommitted (with synchronous standbys) txns in logical replication subscribers)
Date: 2022-02-25 21:52:03
Message-ID: b7561922-c241-2e3c-0da9-cd01bf4e6ebf@amazon.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hello,

On 2/25/22 11:38 AM, Nathan Bossart wrote:

> CAUTION: This email originated from outside of the organization. Do not click links or open attachments unless you can confirm the sender and know the content is safe.
>
>
>
> On Fri, Feb 25, 2022 at 08:31:37PM +0530, Bharath Rupireddy wrote:
>> Thanks Satya and others for the inputs. Here's the v1 patch that
>> basically allows async wal senders to wait until the sync standbys
>> report their flush lsn back to the primary. Please let me know your
>> thoughts.
> I haven't had a chance to look too closely yet, but IIUC this adds a new
> function that waits for synchronous replication. This new function
> essentially spins until the synchronous LSN has advanced.
>
> I don't think it's a good idea to block sending any WAL like this. AFAICT
> it is possible that there will be a lot of synchronously replicated WAL
> that we can send, and it might just be the last several bytes that cannot
> yet be replicated to the asynchronous standbys. І believe this patch will
> cause the server to avoid sending _any_ WAL until the synchronous LSN
> advances.
>
> Perhaps we should instead just choose the SendRqstPtr based on the current
> synchronous LSN. Presumably there are other things we'd need to consider,
> but in general, I think we ought to send as much WAL as possible for a
> given call to XLogSendPhysical().

I think you're right that we'll avoid sending any WAL until sync_lsn
advances. We could setup a contrived situation where the async-walsender
never advances because it terminates before the flush_lsn of the
synchronous_node catches up. And when the async-walsender restarts,
it'll start with the latest flushed on the primary and we could go into
a perpetual loop.

I took a look at the patch and tested basic streaming with async
replicas ahead of the synchronous standby and with logical clients as
well and it works as expected.

>
> ereport(LOG,
>            (errmsg("async standby WAL sender with request LSN %X/%X
is waiting as sync standbys are ahead with flush LSN %X/%X",
>                    LSN_FORMAT_ARGS(flushLSN),
LSN_FORMAT_ARGS(sendRqstPtr)),
>             errhidestmt(true)));

I think this log formatting is incorrect.
s/sync standbys are ahead/sync standbys are behind/ and I think you need
to swap flushLsn and sendRqstPtr

When a walsender is waiting for the lsn on the synchronous replica to
advance and a database stop is issued to the writer, the pg_ctl stop
isn't able to proceed and the database seems to never shutdown.

> Assert(priority >= 0);

What's the point of the assert here?

Also the comments/code refer to AsyncStandbys, however it's also used
for logical clients, which may or may not be standbys. Don't feel too
strongly about the naming here but something to note.

> if (!ShouldWaitForSyncRepl())
>        return;
> ...
> for (;;)
> {
>    // rest of work
> }

If we had a walsender already waiting for an ack, and the conditions of
ShouldWaitForSyncRepl() change, such as disabling
async_standbys_wait_for_sync_replication or synchronous replication
it'll still wait since we never re-check the condition.

postgres=# select wait_event from pg_stat_activity where wait_event like
'AsyncWal%';
              wait_event
--------------------------------------
 AsyncWalSenderWaitForSyncReplication
 AsyncWalSenderWaitForSyncReplication
 AsyncWalSenderWaitForSyncReplication
(3 rows)

postgres=# show synchronous_standby_names;
 synchronous_standby_names
---------------------------

(1 row)

postgres=# show async_standbys_wait_for_sync_replication;
 async_standbys_wait_for_sync_replication
------------------------------------------
 off
(1 row)

>    LWLockAcquire(SyncRepLock, LW_SHARED);
>    flushLSN = walsndctl->lsn[SYNC_REP_WAIT_FLUSH];
>    LWLockRelease(SyncRepLock);

Should we configure this similar to the user's setting of
synchronous_commit instead of just flush? (SYNC_REP_WAIT_WRITE,
SYNC_REP_WAIT_APPLY)

Thanks,

John H

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Peter Geoghegan 2022-02-25 22:00:12 Re: Removing more vacuumlazy.c special cases, relfrozenxid optimizations
Previous Message Melanie Plageman 2022-02-25 21:31:23 Re: why do hash index builds use smgrextend() for new splitpoint pages