Re: Logical walsenders don't process XLOG_CHECKPOINT_SHUTDOWN

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com>, Julien Rouhaud <rjuju123(at)gmail(dot)com>, Michael Paquier <michael(at)paquier(dot)xyz>
Subject: Re: Logical walsenders don't process XLOG_CHECKPOINT_SHUTDOWN
Date: 2023-07-26 04:13:39
Message-ID: CAA4eK1Kem-J5NM7GJCgyKP84pEN6RsG6JWo=6pSn1E+iexL1Fw@mail.gmail.com
Lists: pgsql-hackers

On Tue, Jul 25, 2023 at 10:33 PM Andres Freund <andres(at)anarazel(dot)de> wrote:
>
> On 2023-07-25 14:31:00 +0530, Amit Kapila wrote:
> > To ensure that all the data has been sent during the upgrade, we can
> > ensure that each logical slot's confirmed_flush_lsn (position in the
> > WAL till which subscriber has confirmed that it has applied the WAL)
> > is the same as current_wal_insert_lsn. Now, because we don't send
> > XLOG_CHECKPOINT_SHUTDOWN even on clean shutdown, confirmed_flush_lsn
> > will never be the same as current_wal_insert_lsn. The one idea being
> > discussed in patch [1] (see 0003) is to ensure that each slot's LSN is
> > exactly XLOG_CHECKPOINT_SHUTDOWN ago which probably has some drawbacks
> > like what if we tomorrow add some other WAL in the shutdown checkpoint
> > path or the size of record changes then we would need to modify the
> > corresponding code in upgrade.
>
> Yea, that doesn't seem like a good path. But there is a variant that seems
> better: We could just scan the end of the WAL for records that should have
> been streamed out?
>

This sounds like a better idea. One way to realize it is to group the
slots by confirmed_flush_lsn and then scan the WAL based on those
groups. Once we ensure that the slot group with the highest
confirmed_flush_lsn is up-to-date (i.e., it has no pending WAL other
than the shutdown checkpoint record), any slot group with a lower
confirmed_flush_lsn would be considered a group with pending data.
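
To make the grouping idea concrete, here is a minimal sketch in Python
(the real check would be C code in the upgrade path). Slot, WalRecord,
and slots_with_pending_data are hypothetical names used only for
illustration, and the WAL is modeled as a plain list of (LSN, record
type) entries rather than being read through any PostgreSQL API.

```python
from collections import defaultdict
from dataclasses import dataclass

SHUTDOWN_CHECKPOINT = "XLOG_CHECKPOINT_SHUTDOWN"


@dataclass(frozen=True)
class WalRecord:
    lsn: int   # start position of the record (hypothetical integer LSN)
    kind: str  # record type name


@dataclass(frozen=True)
class Slot:
    name: str
    confirmed_flush_lsn: int


def slots_with_pending_data(slots, wal_records):
    """Return the sorted names of slots that still have WAL to receive."""
    # Group slot names by their confirmed_flush_lsn.
    groups = defaultdict(list)
    for slot in slots:
        groups[slot.confirmed_flush_lsn].append(slot.name)
    if not groups:
        return []

    # Scan the end of the WAL once, starting at the highest confirmed position.
    highest = max(groups)
    tail = [rec for rec in wal_records if rec.lsn >= highest]
    # Assumption: only the shutdown checkpoint record may remain unsent.
    tail_is_clean = all(rec.kind == SHUTDOWN_CHECKPOINT for rec in tail)

    pending = []
    for lsn, names in groups.items():
        # Lower groups are treated as pending; the highest group is pending
        # only if the scanned tail contains something else to stream.
        if lsn < highest or not tail_is_clean:
            pending.extend(names)
    return sorted(pending)
```

With slots a and b at LSN 100, slot c at LSN 80, and a WAL tail holding
only the shutdown checkpoint at LSN 100, this reports only c as pending.
Note that the sketch needs a single scan regardless of the number of
slots, which is the point of grouping.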

BTW, I think the main argument against trying to send
XLOG_CHECKPOINT_SHUTDOWN to logical walsenders is that, even though
today there is no risk of hint bit updates (or any other way of
generating WAL) while decoding XLOG_CHECKPOINT_SHUTDOWN, there is no
guarantee that this will remain true in the future. Is there anything
I am missing here?

--
With Regards,
Amit Kapila.
