From: | Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com> |
---|---|
To: | masao(dot)fujii(at)oss(dot)nttdata(dot)com |
Cc: | bharath(dot)rupireddyforpostgres(at)gmail(dot)com, nathandbossart(at)gmail(dot)com, sfrost(at)snowman(dot)net, bossartn(at)amazon(dot)com, rjuju123(at)gmail(dot)com, michael(at)paquier(dot)xyz, pgsql-hackers(at)lists(dot)postgresql(dot)org |
Subject: | Re: Add checkpoint and redo LSN to LogCheckpointEnd log message |
Date: | 2022-02-07 03:02:58 |
Message-ID: | 20220207.120258.310426179780547983.horikyota.ntt@gmail.com |
Lists: | pgsql-hackers |
At Mon, 07 Feb 2022 10:16:34 +0900 (JST), Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com> wrote in
> At Fri, 4 Feb 2022 10:59:04 +0900, Fujii Masao <masao(dot)fujii(at)oss(dot)nttdata(dot)com> wrote in
> > On 2022/02/03 15:50, Kyotaro Horiguchi wrote:
> > > By the way, restart point should start only while recovering, and at
> > > the time of the start both checkpoint.redo and checkpoint LSN are
> > > already past. We shouldn't update minRecovery point after promotion,
> > > but is there any reason for not updating the checkPoint and
> > > checkPointCopy? If we update them after promotion, the
> > > which-LSN-to-show problem would be gone.
> >
> > I tried to find the reason by reading the past discussion, but have
> > not found that yet.
> >
> > If we update checkpoint and REDO LSN at pg_control in that case, we
> > also need to update min recovery point at pg_control? Otherwise the
> > min recovery point at pg_control still indicates the old LSN that
> > previous restart point set.
>
> I had an assumption that the reason I think it shouldn't update
> minRecoveryPoint is that it has been or is going to be reset to
> invalid LSN by promotion and the checkpoint should refrain from
> touching it.
Hmm.. That doesn't seem to be the case. If a server crashes just after
promotion and before the post-promotion checkpoint is requested,
minRecoveryPoint stays at a valid LSN.
(Promoted at 0/7000028)
Database cluster state: in production
Latest checkpoint location: 0/6000060
Latest checkpoint's REDO location: 0/6000028
Latest checkpoint's REDO WAL file: 000000010000000000000006
Minimum recovery ending location: 0/7000090
Min recovery ending loc's timeline: 2
minRecoveryPoint/TLI are ignored whenever a server whose state is
"in production" is started. In other words, the values are useless
there. There's no clear or written reason for not recording the last
ongoing restartpoint after promotion.
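For reference, the ordering of the LSNs in the output above can be
checked with a small sketch. It assumes the usual textual LSN form of
two hexadecimal halves ("high/low") combined into a 64-bit value;
parse_lsn and lsn_order_matches_example are hypothetical helpers for
illustration, not PostgreSQL functions.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical helper (not a PostgreSQL function): parse "hi/lo" hex text
 * into a 64-bit LSN.  Returns 1 on success, 0 on malformed input.
 */
static int
parse_lsn(const char *text, uint64_t *lsn)
{
    unsigned int hi;
    unsigned int lo;

    if (sscanf(text, "%X/%X", &hi, &lo) != 2)
        return 0;
    *lsn = ((uint64_t) hi << 32) | lo;
    return 1;
}

/*
 * Check the ordering seen in the example: REDO < checkpoint < promotion
 * point < minRecoveryPoint.  Returns 1 if the ordering holds.
 */
static int
lsn_order_matches_example(void)
{
    uint64_t    redo;
    uint64_t    ckpt;
    uint64_t    promoted;
    uint64_t    minrecov;
    int         ok;

    ok = parse_lsn("0/6000028", &redo) &&
        parse_lsn("0/6000060", &ckpt) &&
        parse_lsn("0/7000028", &promoted) &&
        parse_lsn("0/7000090", &minrecov);
    assert(ok);                 /* the literals above are well-formed */

    return redo < ckpt && ckpt < promoted && promoted < minrecov;
}
```

That is, the control file still carries a valid minRecoveryPoint beyond
the promotion point, while the recorded checkpoint is the older one.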
Before fast promotion was introduced, we shouldn't have gotten there
after the end-of-recovery checkpoint (but somehow it was reached
sometimes?), but it is quite normal nowadays. Or rather, we now expect
it to happen and it is regarded as a normal checkpoint. So what we
should do there nowadays is as follows.
- If any later checkpoint/restartpoint has been established, just skip
  the remaining tasks and return false. (!chkpt_was_latest)
  (I'm not sure this can happen, though.)
- Update the control file only while archive recovery is still ongoing.
- Otherwise reset minRecoveryPoint, then continue.
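
The steps above could be sketched roughly as below. This is only one
reading of them, not a patch: it assumes the restartpoint's checkpoint
and REDO LSN are recorded in either case (as argued earlier) and that
only the handling of minRecoveryPoint depends on the recovery state.
All types and names here (ControlData, RestartPointInfo,
finish_restartpoint, INVALID_LSN) are hypothetical stand-ins, not the
real pg_control structures or xlog.c functions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef uint64_t Lsn;
#define INVALID_LSN ((Lsn) 0)

/* Hypothetical, simplified stand-in for the cluster state in pg_control. */
typedef enum
{
    STATE_IN_ARCHIVE_RECOVERY,
    STATE_IN_PRODUCTION
} ClusterState;

/* Hypothetical, simplified stand-in for the control file contents. */
typedef struct
{
    ClusterState state;
    Lsn         checkpoint;         /* latest checkpoint location */
    Lsn         redo;               /* latest checkpoint's REDO location */
    Lsn         min_recovery_point; /* minimum recovery ending location */
} ControlData;

/* The restartpoint being completed (hypothetical). */
typedef struct
{
    Lsn         checkpoint;         /* location of the checkpoint record */
    Lsn         redo;               /* its REDO location */
    Lsn         end;                /* end of the checkpoint record */
} RestartPointInfo;

/*
 * Returns false if a later checkpoint/restartpoint is already recorded
 * (the "!chkpt_was_latest" case); otherwise records this one and returns
 * true so the caller can continue with the remaining tasks.
 */
static bool
finish_restartpoint(ControlData *ctl, const RestartPointInfo *rp)
{
    assert(rp->redo <= rp->checkpoint);

    /* Step 1: something newer is already established; skip the rest. */
    if (ctl->redo >= rp->redo)
        return false;

    /* Assumption: record the restartpoint regardless of the state. */
    ctl->checkpoint = rp->checkpoint;
    ctl->redo = rp->redo;

    if (ctl->state == STATE_IN_ARCHIVE_RECOVERY)
    {
        /* Step 2 (simplified): keep minRecoveryPoint from falling behind. */
        if (ctl->min_recovery_point < rp->end)
            ctl->min_recovery_point = rp->end;
    }
    else
    {
        /* Step 3: already promoted, so the value is of no use; reset it. */
        ctl->min_recovery_point = INVALID_LSN;
    }

    return true;
}
```

With the control file state from the example (in production,
minRecoveryPoint at 0/7000090), completing a restartpoint this way
would record its checkpoint and REDO LSN and leave minRecoveryPoint
invalid; a second attempt with the same restartpoint would return
false.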
Do you have any thoughts or opinions?
regards.
--
Kyotaro Horiguchi
NTT Open Source Software Center