Re: pg_rewind: warn when checkpoint hasn't happened after promotion

From: Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>
To: jtc331(at)gmail(dot)com
Cc: robertmhaas(at)gmail(dot)com, pgsql-hackers(at)postgresql(dot)org
Subject: Re: pg_rewind: warn when checkpoint hasn't happened after promotion
Date: 2022-07-06 02:38:42
Message-ID: 20220706.113842.34994619007220403.horikyota.ntt@gmail.com
Lists: pgsql-hackers

At Tue, 5 Jul 2022 14:46:13 -0400, James Coleman <jtc331(at)gmail(dot)com> wrote in
> On Tue, Jul 5, 2022 at 2:39 PM Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> >
> > On Sat, Jun 4, 2022 at 8:59 AM James Coleman <jtc331(at)gmail(dot)com> wrote:
> > > A quick background refresher: after promoting a standby rewinding the
> > > former primary requires that a checkpoint have been completed on the
> > > new primary after promotion. This is correctly documented. However
> > > pg_rewind incorrectly reports to the user that a rewind isn't
> > > necessary because the source and target are on the same timeline.
> >
> > Is there anything intrinsic to the mechanism of operation of pg_rewind
> > that requires a timeline change, or could we just rewind within the
> > same timeline to an earlier LSN? In other words, maybe we could just
> > remove this limitation of pg_rewind, and then perhaps it wouldn't be
> > necessary to determine what the new timeline is.
>
> I think (someone can correct me if I'm wrong) that in theory the
> mechanisms would support the source and target being on the same
> timeline, but in practice that presents problems since you'd not have
> an LSN you could detect as the divergence point. If we allowed passing
> "rewind to" point LSN value, then that (again, as far as I understand
> it) would work, but it's a different use case. Specifically I wouldn't
> want that option to need to be used for this particular case since in
> my example there is in fact a real divergence point that we should be
> detecting automatically.

The point of pg_rewind is to find the divergence point, then find all
blocks modified in the dead history (from the divergence point on) and
"replace" them with those of the live history. In that sense, to be
exact, pg_rewind does not "rewind" a cluster. If there is no
divergence point, the last LSN of the cluster that is behind (the
target cluster?) serves as that point, and no block needs to be
replaced at all, because no WAL exists on that cluster after its last
LSN.
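
To illustrate the idea only, here is a toy sketch in Python. It is not
pg_rewind's code; WalRecord, its fields, and blocks_to_replace are
hypothetical names, and LSNs are plain integers. It collects the
blocks touched by target-side WAL records at or after the divergence
point, and yields nothing when the "divergence point" is simply the
end of the target's WAL.

```python
from dataclasses import dataclass
from typing import Iterable, Set, Tuple

Block = Tuple[str, int]  # (relation file path, block number), hypothetical model


@dataclass(frozen=True)
class WalRecord:
    """Toy stand-in for a WAL record: its start LSN and the one block it modifies."""
    lsn: int
    block: Block


def blocks_to_replace(target_wal: Iterable[WalRecord], divergence_lsn: int) -> Set[Block]:
    """Return blocks modified on the target at or after the divergence point.

    These are the blocks of the dead history that would have to be
    overwritten with the source's (live history's) versions.
    """
    return {rec.block for rec in target_wal if rec.lsn >= divergence_lsn}


if __name__ == "__main__":
    wal = [
        WalRecord(100, ("base/1/16384", 0)),
        WalRecord(200, ("base/1/16384", 3)),
        WalRecord(300, ("base/1/16385", 1)),
    ]
    # Histories diverged at LSN 200: two blocks belong to the dead history.
    print(sorted(blocks_to_replace(wal, 200)))
    # No divergence: the point is past the last record, so nothing to replace.
    print(sorted(blocks_to_replace(wal, 400)))
```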

The issue here is that pg_rewind looks into the control file to
determine the source timeline, but the control file is not updated
until the first checkpoint after promotion completes, even though
file blocks have already diverged by then.

Even in that case the history file for the new timeline has already
been created, so searching for the latest history file works.
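
A rough sketch of that search, in Python for brevity. It is not a
proposed patch; latest_timeline and its arguments are hypothetical.
The only thing taken from PostgreSQL is the history file naming, eight
uppercase hex digits followed by ".history". Given a listing of the
source's WAL directory and the timeline ID read from the control file,
it returns the highest timeline it can see.

```python
import re
from typing import Iterable

# Timeline history files are named like "00000002.history".
HISTORY_RE = re.compile(r"^([0-9A-F]{8})\.history$")


def latest_timeline(wal_dir_listing: Iterable[str], control_file_tli: int) -> int:
    """Return the newest timeline ID known from history files or the control file.

    wal_dir_listing: file names found in the source's WAL directory.
    control_file_tli: timeline ID from the control file, which may be stale
    until the first checkpoint after promotion completes.
    """
    newest = control_file_tli
    for name in wal_dir_listing:
        match = HISTORY_RE.match(name)
        if match:
            newest = max(newest, int(match.group(1), 16))
    return newest


if __name__ == "__main__":
    listing = ["00000002.history", "000000020000000000000003", "00000003.history"]
    # The control file still says timeline 2, but a history file for 3 exists.
    print(latest_timeline(listing, control_file_tli=2))
```

If the result is greater than the control file's timeline, promotion
has happened but the checkpoint has not, which is exactly the case
being discussed.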

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center
