Re: Impact of checkpointer during pg_upgrade

From: Dilip Kumar <dilipbalaut(at)gmail(dot)com>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Sawada Masahiko <sawada(dot)mshk(at)gmail(dot)com>, "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com>, vignesh C <vignesh21(at)gmail(dot)com>, Hou, Zhijie/侯 志杰 <houzj(dot)fnst(at)fujitsu(dot)com>, Peter Smith <smithpb2250(at)gmail(dot)com>
Subject: Re: Impact of checkpointer during pg_upgrade
Date: 2023-09-04 10:48:51
Message-ID: CAFiTN-t_-ON=B5yPDuXfsPuA8q6pBW4-yqHUBY4=AQNMgLoKTg@mail.gmail.com
Lists: pgsql-hackers

On Mon, Sep 4, 2023 at 1:41 PM Dilip Kumar <dilipbalaut(at)gmail(dot)com> wrote:
>
> > > I think we can do better: we can just read the latest
> > > checkpoint's LSN before starting the old cluster. Then, while
> > > checking the slots, can't we check whether an invalidated slot has
> > > confirmed_flush_lsn >= the latest_checkpoint_lsn we preserved
> > > before starting the cluster? If so, that slot might have
> > > been invalidated during the upgrade, no?
> > >
> >
> > Isn't that possible only if we update confirmed_flush LSN while
> > invalidating? Otherwise, how can the check you are proposing succeed?
>
> I am not suggesting that we compare confirmed_flush_lsn to the latest
> checkpoint LSN. Instead, I am suggesting that before starting the
> cluster we record the latest checkpoint LSN, which should
> be the shutdown checkpoint LSN. Already in [1] we check that
> confirmed_flush_lsn is equal to the latest checkpoint LSN. So
> the only problem is that after we restart the cluster during the
> upgrade we might invalidate some slots that are perfectly fine
> to migrate, and we want to identify those slots. If we know the
> LSN of the shutdown checkpoint from before the cluster was started, we
> can perform an additional check on all the invalidated slots: if their
> confirmed_flush_lsn >= the shutdown checkpoint LSN we preserved before
> restarting the cluster (not the latest checkpoint LSN), then those
> slots were invalidated only after we started the cluster for the upgrade.
> Is there any loophole in this theory? It is based on the
> assumption that confirmed_flush_lsn does not move forward for
> already invalidated slots. That means a slot that was invalidated
> before we shut down for the upgrade will have confirmed_flush_lsn <
> the shutdown checkpoint LSN, and a slot that was invalidated during the
> upgrade will have confirmed_flush_lsn at least equal to the shutdown
> checkpoint LSN.

That said, there is a possibility that a slot invalidated at the
previous checkpoint ends up with the same LSN as a slot invalidated
later, if there is no activity between these two checkpoints. So if we
go with this approach, there is some risk of migrating slots that were
already invalidated before the shutdown checkpoint.
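
To make the proposed check and its weak spot concrete, here is a rough
sketch in Python. It is not pg_upgrade code: the Slot class, the slot
names, and the LSN values are all made up for illustration, and the
only thing taken from PostgreSQL is the textual "hi/lo" hexadecimal LSN
format. It shows that the comparison against the preserved shutdown
checkpoint LSN cannot tell apart two invalidated slots that carry the
same confirmed_flush_lsn.

```python
from dataclasses import dataclass


def parse_lsn(text: str) -> int:
    """Convert a textual LSN such as '0/3000060' into a 64-bit integer."""
    hi, lo = text.split("/")
    return (int(hi, 16) << 32) | int(lo, 16)


@dataclass
class Slot:
    # Hypothetical stand-in for a logical replication slot.
    name: str
    confirmed_flush_lsn: int
    invalidated: bool


def maybe_invalidated_during_upgrade(slot: Slot, shutdown_checkpoint_lsn: int) -> bool:
    """The proposed heuristic: an invalidated slot whose confirmed flush LSN
    is at or past the shutdown checkpoint LSN recorded before the old
    cluster was restarted is assumed to have been invalidated only during
    the upgrade."""
    return slot.invalidated and slot.confirmed_flush_lsn >= shutdown_checkpoint_lsn


# LSN recorded before starting the old cluster (made-up value).
shutdown_lsn = parse_lsn("0/3000060")

slots = [
    # Invalidated long before shutdown: clearly rejected.
    Slot("invalidated_long_ago", parse_lsn("0/2000000"), True),
    # Invalidated after the restart: the case we want to rescue.
    Slot("invalidated_during_upgrade", parse_lsn("0/3000060"), True),
    # Assumed to have been invalidated at an earlier checkpoint but, with
    # no activity in between, carrying the same LSN: indistinguishable.
    Slot("invalidated_earlier_same_lsn", parse_lsn("0/3000060"), True),
    # Not invalidated at all, so the heuristic does not apply.
    Slot("healthy", parse_lsn("0/3000060"), False),
]

candidates = [s.name for s in slots if maybe_invalidated_during_upgrade(s, shutdown_lsn)]
print(candidates)
```

Both slots at 0/3000060 pass the check, so the heuristic alone would
also pick up the slot that was already invalid before the shutdown.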

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com
