Re: pg_rewind: warn when checkpoint hasn't happened after promotion

From: Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>
To: bharath(dot)rupireddyforpostgres(at)gmail(dot)com
Cc: jtc331(at)gmail(dot)com, pgsql-hackers(at)postgresql(dot)org
Subject: Re: pg_rewind: warn when checkpoint hasn't happened after promotion
Date: 2022-06-06 05:26:02
Message-ID: 20220606.142602.2160457289831431243.horikyota.ntt@gmail.com
Lists: pgsql-hackers

At Sat, 4 Jun 2022 19:09:41 +0530, Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com> wrote in
> On Sat, Jun 4, 2022 at 6:29 PM James Coleman <jtc331(at)gmail(dot)com> wrote:
> >
> > A few weeks back I sent a bug report [1] directly to the -bugs mailing
> > list, and I haven't seen any activity on it (maybe this is because I
> > emailed directly instead of using the form?), but I got some time to
> > take a look and concluded that a first-level fix is pretty simple.
> >
> > A quick background refresher: after promoting a standby rewinding the
> > former primary requires that a checkpoint have been completed on the
> > new primary after promotion. This is correctly documented. However
> > pg_rewind incorrectly reports to the user that a rewind isn't
> > necessary because the source and target are on the same timeline.
...
> > Attached is a patch that detects this condition and reports it as an
> > error to the user.

I have some random thoughts on this.

There could be a problem when the old primary has been shut down
gracefully, so I think it is worth doing something if it can be done
in a simple way.

However, I don't think we can simply rely on minRecoveryPoint to
detect that situation, since it won't be reset on a standby. A standby
can also still be the upstream of a cascading standby. So, as
discussed in the thread for the comment [2], what we can do here would
be to simply wait for the timeline ID to advance, perhaps with a timeout.
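
To make the waiting idea concrete, here is a minimal sketch in Python
(not pg_rewind code). It assumes a hypothetical callable get_source_tli
that returns the timeline ID currently recorded on the source, for
example read from its control file; the function name, the parameters
and the default timeout are all made up for illustration.

```python
import time
from typing import Callable


def wait_for_timeline_advance(
    get_source_tli: Callable[[], int],   # hypothetical: reads the source's timeline ID
    target_tli: int,
    timeout_s: float = 60.0,             # assumed default, not from any real tool
    poll_interval_s: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> bool:
    """Poll until the source's timeline ID exceeds the target's.

    Returns True once the timeline has advanced, False on timeout.
    """
    deadline = clock() + timeout_s
    while True:
        if get_source_tli() > target_tli:
            return True
        if clock() >= deadline:
            return False
        sleep(poll_interval_s)
```

The sleep and clock parameters are injectable only so that the loop can
be exercised without real waiting.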

In a single-step replication setup, a checkpoint request to the
primary makes the end-of-recovery checkpoint fast. It won't work as
expected with cascading replicas, but that might be acceptable.

> > In the spirit of the new-ish "ensure shutdown" functionality I could
> > imagine extending this to automatically issue a checkpoint when this
> > situation is detected. I haven't started to code that up, however,
> > wanting to first get buy-in on that.
> >
> > 1: https://www.postgresql.org/message-id/CAAaqYe8b2DBbooTprY4v=BiZEd9qBqVLq+FD9j617eQFjk1KvQ@mail.gmail.com
>
> Thanks. I had a quick look over the issue and patch - just a thought -
> can't we let pg_rewind issue a checkpoint on the new primary instead
> of erroring out, maybe optionally? It might sound too much, but helps
> pg_rewind to be self-reliant i.e. avoiding external actor to detect
> the error and issue a checkpoint on the new primary to be able to
> successfully run pg_rewind on the old primary and repair it to use it
> as a new standby.

At the time of the discussion [2], the hindrance was that issuing a
checkpoint required superuser privileges. That has since been narrowed
down to the pg_checkpointer privileges.

If we know that the timeline IDs are different, we don't need to wait
for a checkpoint.

It seems to me that the exit status is significant. pg_rewind exits
with 1 when an invalid option is given. I don't think it is great if
we report this state with the same code.
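
As a rough illustration of why a separate code would help, here is a
sketch in Python of how a calling script could react. The value 2 and
all names below are hypothetical; the only thing taken from pg_rewind
itself is that it exits with 1 for an invalid option.

```python
from enum import IntEnum


class RewindExit(IntEnum):
    OK = 0
    FAILURE = 1             # includes invalid options, as pg_rewind does today
    CHECKPOINT_PENDING = 2  # hypothetical: source has not checkpointed since promotion


def should_retry(exit_code: int) -> bool:
    """Only the hypothetical 'checkpoint pending' state is worth retrying."""
    return exit_code == RewindExit.CHECKPOINT_PENDING
```

If both conditions were reported as 1, such a caller could not tell a
retryable state from a usage error.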

I don't think we always want to request a non-spreading checkpoint.

[2] https://www.postgresql.org/message-id/flat/CABUevEz5bpvbwVsYCaSMV80CBZ5-82nkMzbb%2BBu%3Dh1m%3DrLdn%3Dg%40mail.gmail.com

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center
