Re: Disallow quorum uncommitted (with synchronous standbys) txns in logical replication subscribers

From: SATYANARAYANA NARLAPURAM <satyanarlapuram(at)gmail(dot)com>
To: Jeff Davis <pgsql(at)j-davis(dot)com>
Cc: Andres Freund <andres(at)anarazel(dot)de>, Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Disallow quorum uncommitted (with synchronous standbys) txns in logical replication subscribers
Date: 2022-01-08 23:28:13
Message-ID: CAHg+QDfO2Fhtb01oiq_4F_9n-Y2HRX4NPTY1ax9Bqse7NiYgmA@mail.gmail.com
Lists: pgsql-hackers

On Fri, Jan 7, 2022 at 4:52 PM Jeff Davis <pgsql(at)j-davis(dot)com> wrote:

> On Fri, 2022-01-07 at 14:54 -0800, Andres Freund wrote:
> > > If you only promote the furthest-ahead sync replica (which is what you
> > > should be doing if you have quorum commit), why wouldn't that work?
> >
> > Remove "sync" from the above sentence, and the sentence holds true for
> > combinations of sync/async replicas as well.
>
> Technically that's true, but it seems like a bit of a strange use case.
> I would think people doing that would just include those async replicas
> in the sync quorum instead.
>
> The main case I can think of for a mix of sync and async replicas is
> if they are just managed differently. For instance, the sync replica
> quorum is managed for a core part of the system, strategically
> allocated on good hardware in different locations to minimize the
> chance of dependent failures; while the async read replicas are
> optional for taking load off the primary, and may appear/disappear in
> whatever location and on whatever hardware is most convenient.
>
> But if an async replica can get ahead of the sync rep quorum, then the
> most recent transactions can appear in query results, so that means the
> WAL shouldn't be lost, and the async read replicas become a part of the
> durability model.
>
> If the async read replica can't be promoted because it's not suitable
> (due to location, hardware, whatever), then you need to frantically
> copy the final WAL records out to an instance in the sync rep quorum.
> That requires extra ceremony for every failover, and might be dubious
> depending on how safe the WAL on your async read replicas is, and
> whether there are dependent failure risks.
>

This may not always be possible, as the scenario below shows.

>
> Yeah, I guess there could be some use case woven amongst those caveats,
> but I'm not sure if anyone is actually doing that combination of things
> safely today. If someone is, it would be interesting to know more about
> that use case.
>
> The proposal in this thread is quite a bit simpler: manage your sync
> quorum and your async read replicas separately, and keep the sync rep
> quorum ahead.
>
> > > > To me this just sounds like trying to shoehorn something into syncrep
> > > > that it's not made for.
> > >
> > > What *is* sync rep made for?
>
> This was a sincere question and an answer would be helpful. I think
> many of the discussions about sync rep get derailed because people have
> different ideas about when and how it should be used, and the
> documentation is pretty light.
>
> > This is especially relevant in cases where synchronous_commit=on vs local
> > is used selectively
>
> That's an interesting point.
>
> However, it's hard for me to reason about "kinda durable" and "a little
> more durable" and I'm not sure how many people would care about that
> distinction.
>
> > I don't see that. This presumes that WAL replicated to async replicas is
> > somehow bad.
>
> Simple case: primary and async read replica are in the same server
> rack. Sync replicas are geographically distributed with quorum commit.
> Read replica gets the WAL first (because it's closest), starts
> answering queries that include that WAL, and then the entire rack
> catches fire. Now you've returned results to the client, but lost the
> transactions.
>

Another similar example: in a multi-AZ HA setup, the primary and the sync
replicas are deployed in two different availability zones, while the async
read replicas can be in any availability zone. Assume an async replica
lands in the same AZ as the primary. If the primary's availability zone goes
down, the primary and that async replica go down at the same time. The async
replica could have been ahead of the sync replica, and the missing WAL can't
be collected because the primary and the async replica failed together.
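
To make the multi-AZ scenario concrete, here is a rough sketch in Python. It
is only a toy model, not PostgreSQL code: the Node type, the AZ labels, the
function name, and the use of plain integers for WAL positions are all
assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    name: str          # hypothetical node name
    az: str            # hypothetical availability-zone label
    role: str          # "primary", "sync", or "async"
    flushed_lsn: int   # last flushed WAL position, simplified to an integer


def exposed_but_lost_wal(nodes, failed_az):
    """Amount of WAL that an async replica in the failed AZ had already
    replayed (so readers may have seen it) but that no surviving node holds."""
    survivors = [n for n in nodes if n.az != failed_az]
    lost = [n for n in nodes if n.az == failed_az]
    # Furthest WAL position still available anywhere after the failure.
    best_surviving = max((n.flushed_lsn for n in survivors), default=0)
    # Furthest WAL position that a failed async replica had exposed to queries.
    exposed = max((n.flushed_lsn for n in lost if n.role == "async"), default=0)
    return max(0, exposed - best_surviving)


cluster = [
    Node("primary", "az1", "primary", 100),
    Node("read1", "az1", "async", 100),   # same AZ as the primary, fully caught up
    Node("sync1", "az2", "sync", 90),     # sync replica lagging behind
]

print(exposed_but_lost_wal(cluster, "az1"))
```

In this model, losing az1 leaves only sync1, so the WAL between positions 90
and 100 was visible to readers on read1 but cannot be recovered from any
surviving node. Losing az2 instead exposes nothing, since the primary and the
async replica both still hold all of the WAL.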

>
> Regards,
> Jeff Davis
>
>
>
