Re: Some thoughts about the TAP tests' wait_for_catchup()

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Some thoughts about the TAP tests' wait_for_catchup()
Date: 2021-09-30 05:14:26
Message-ID: CAA4eK1LtWy+JdxTETdgcbCrExydPvyynuwdKnDU2M4LpTU7k1A@mail.gmail.com
Lists: pgsql-hackers

On Wed, Sep 29, 2021 at 9:29 PM Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>
> Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> writes:
> > On Wed, Sep 29, 2021 at 3:47 AM Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> >> It seems to me that for most purposes wait_for_catchup's approach is
> >> strictly worse, for two reasons:
> >> 1. It continually recomputes the primary's pg_current_wal_lsn().
> >> 2. It's querying the primary's view of the standby's state, which
> >> introduces a reporting delay.
>
> > I can't comment on all the use cases of wait_for_catchup() but I think
> > there are some use cases in logical replication where we need the
> > publisher to use wait_for_catchup after setting up the replication to
> > ensure that wal sender is started and in-proper state by checking its
> > state (which should be 'streaming'). That also implicitly checks if
> > the wal receiver has responded to initial ping requests by sending
> > replay location.
>
> Yeah, for logical replication we can't look at the subscriber's WAL
> positions because they could be totally different. What I'm on
> about is the tests that use physical replication. I think examining
> the standby's state directly is better in that case, for the reasons
> I cited.
>
> I guess the question of interest is whether it's sufficient to test
> the walreceiver feedback mechanisms in the context of logical
> replication, or whether we feel that the physical-replication code
> path is enough different that there should be a specific test for
> that combination too.
>

There is a difference in how feedback messages are handled in the
physical and logical replication code paths. It is mainly about how we
advance the slot's LSN based on the WAL flush position reported by the
receiver. See the end of ProcessStandbyReplyMessage(), where we call
different functions depending on the slot type. So, I think it is
better to have a test for the physical replication feedback mechanism.

--
With Regards,
Amit Kapila.
