Re: Allow async standbys wait for sync replication

From: Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>
To: Nathan Bossart <nathandbossart(at)gmail(dot)com>
Cc: Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>, SATYANARAYANA NARLAPURAM <satyanarlapuram(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Allow async standbys wait for sync replication
Date: 2022-03-05 08:44:54
Message-ID: CALj2ACUpRdtM_6RpumzNbO6D_XbSjVMfnNp5yVOnkKUnS4Z+xA@mail.gmail.com
Lists: pgsql-hackers

On Sat, Mar 5, 2022 at 1:26 AM Nathan Bossart <nathandbossart(at)gmail(dot)com> wrote:
>
> On Wed, Mar 02, 2022 at 09:47:09AM +0530, Bharath Rupireddy wrote:
> > On Wed, Mar 2, 2022 at 2:57 AM Nathan Bossart <nathandbossart(at)gmail(dot)com> wrote:
> >> I think there are a couple of advantages. For one, spinning is probably
> >> not the best from a resource perspective.
> >
> > Just to be on the same page - by spinning do you mean - the async
> > walsender waiting for the sync flushLSN in a for-loop with
> > WaitLatch()?
>
> Yes.
>
> >> Also, this approach might fit in better
> >> with the existing synchronous replication framework. When a WAL sender
> >> realizes that it can't send up to the current "flush" LSN because it's not
> >> synchronously replicated, it will request to be alerted when it is.
> >
> > I think you are referring to the way a backend calls SyncRepWaitForLSN
> > and waits until any one of the walsender sets syncRepState to
> > SYNC_REP_WAIT_COMPLETE in SyncRepWakeQueue. Firstly, SyncRepWaitForLSN
> > blocking i.e. the backend spins/waits in for (;;) loop until its
> > syncRepState becomes SYNC_REP_WAIT_COMPLETE. The backend doesn't do
> > any other work but waits. So, spinning isn't avoided completely.
> >
> > Unless I'm missing something, the existing sync repl queue
> > (SyncRepQueue) mechanism doesn't avoid spinning in the requestors
> > (backends) SyncRepWaitForLSN or in the walsenders SyncRepWakeQueue.
>
> My point is that there are existing tools for alerting processes when an
> LSN is synchronously replicated and for waking up WAL senders. What I am
> proposing wouldn't involve spinning in XLogSendPhysical() waiting for
> synchronous replication. Like SyncRepWaitForLSN(), we'd register our LSN
> in the queue (SyncRepQueueInsert()), but we wouldn't sit in a separate loop
> waiting to be woken. Instead, SyncRepWakeQueue() would eventually wake up
> the WAL sender and trigger another iteration of WalSndLoop().

I understand. Even with the SyncRepWaitForLSN approach, the async
walsenders will have nothing to do in WalSndLoop() until the sync
walsender wakes them up via SyncRepWakeQueue. The SyncRepWaitForLSN
approach certainly avoids the extra looping and makes the code look
better. One concern is the increased burden on SyncRepLock: the
SyncRepWaitForLSN approach needs to take it in exclusive mode
(LWLockAcquire(SyncRepLock, LW_EXCLUSIVE);), and the async walsenders
would now be added to the list of backends that contend for
SyncRepLock. The other approach that I proposed earlier would only
require SyncRepLock in shared mode, as it just needs to read the
flushLSN. I'm not sure how big a problem this is.

Having said that, I agree that the SyncRepWaitForLSN approach probably
makes things easier and avoids the new wait loops.
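
To make the register-and-wake idea concrete, here is a rough standalone
sketch of an LSN-ordered wait queue. It is a toy model under my own
assumptions, not PostgreSQL code: ToyLsn, ToyWaiter, ToyQueue,
toy_queue_insert() and toy_wake_queue() are made-up names, and locking
(SyncRepLock), latches and shared memory are left out entirely. It only
shows the shape of the pattern: a waiter registers the LSN it needs, and
whoever advances the sync flush LSN wakes every waiter at or below it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for an LSN; not the real XLogRecPtr. */
typedef uint64_t ToyLsn;

/* One registered waiter (in this thread's terms, an async walsender). */
typedef struct ToyWaiter
{
    ToyLsn      wait_lsn;   /* LSN that must be synchronously replicated */
    bool        woken;      /* set to true by toy_wake_queue() */
    struct ToyWaiter *next;
} ToyWaiter;

/* Singly linked queue kept sorted by ascending wait_lsn. */
typedef struct ToyQueue
{
    ToyWaiter  *head;
} ToyQueue;

/*
 * Register a waiter for the given LSN, keeping the queue ordered.
 * A real implementation would hold an exclusive lock here.
 */
static void
toy_queue_insert(ToyQueue *q, ToyWaiter *w, ToyLsn lsn)
{
    ToyWaiter **link = &q->head;

    assert(q != NULL && w != NULL);
    w->wait_lsn = lsn;
    w->woken = false;

    /* Skip past every waiter whose LSN is not greater than ours. */
    while (*link != NULL && (*link)->wait_lsn <= lsn)
        link = &(*link)->next;

    w->next = *link;
    *link = w;
}

/*
 * Wake every waiter whose LSN is at or below sync_flush_lsn and
 * remove it from the queue. Returns the number of waiters woken.
 */
static int
toy_wake_queue(ToyQueue *q, ToyLsn sync_flush_lsn)
{
    int         nwoken = 0;

    while (q->head != NULL && q->head->wait_lsn <= sync_flush_lsn)
    {
        ToyWaiter  *w = q->head;

        q->head = w->next;
        w->next = NULL;
        w->woken = true;    /* real code would set the waiter's latch */
        nwoken++;
    }
    return nwoken;
}
```

Because the queue stays sorted, the wake side can stop at the first
waiter beyond the flush LSN. The cost is that both insert and wake
modify the queue, which is why this pattern needs the lock in exclusive
mode, whereas simply reading the flush LSN would only need shared mode.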

Let me think more and work on this approach.

Regards,
Bharath Rupireddy.
