Re: [BUG] Take a long time to reach consistent after pg_rewind

From: Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>
To: cca5507(at)qq(dot)com
Cc: suryapoondla4(at)gmail(dot)com, pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: [BUG] Take a long time to reach consistent after pg_rewind
Date: 2026-06-12 08:33:30
Message-ID: 20260612.173330.2199357374649405029.horikyota.ntt@gmail.com
Lists: pgsql-hackers

Hello.

At Thu, 11 Jun 2026 15:50:19 +0800, "cca5507" <cca5507(at)qq(dot)com> wrote in
> > I think pg_rewind is probably using the insert LSN because it wants to
> > choose a conservative position as far ahead as possible.  It might be
> > possible to use the flush LSN if the copying logic is carefully
> > arranged, but I would prefer to keep using the insert LSN if we can.
>
> The insert LSN is not crash-safe; does it really make sense to use it? For
> example, the primary has insert LSN 1000 and flush LSN 500, the standby
> sets minRecoveryPoint to 1000, and then the primary crashes and restarts.
> The primary now only has WAL up to LSN 500, but the standby cannot reach
> consistency until LSN 1000. This doesn’t make sense to me.

My understanding had been that the state produced by pg_rewind only
needed to be valid with respect to the source server at the time
pg_rewind was run.

If the expectation is that the rewound standby must also remain usable
after the source server subsequently goes through crash recovery, then
I agree that using the insert LSN becomes harder to justify.
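
To make the quoted scenario concrete, here is a toy model of it. This is
only a sketch, not PostgreSQL code: LSNs are plain integers, and the
names Primary, crash_and_restart, and can_reach_consistency are
hypothetical helpers invented for this illustration. It assumes that
only flushed WAL survives a crash and that the standby can replay only
WAL the primary still has.

```python
from dataclasses import dataclass


@dataclass
class Primary:
    """Toy stand-in for the source server (not real PostgreSQL state)."""
    insert_lsn: int  # WAL generated so far, possibly still in memory
    flush_lsn: int   # WAL known to be durable on disk

    def crash_and_restart(self) -> None:
        # Assumption of this model: WAL beyond the flush LSN is lost.
        self.insert_lsn = self.flush_lsn


def can_reach_consistency(min_recovery_point: int, primary: Primary) -> bool:
    # The standby is consistent once it has replayed up to
    # min_recovery_point, and it can only replay WAL the primary has.
    return primary.flush_lsn >= min_recovery_point


def demo() -> tuple[bool, bool]:
    primary = Primary(insert_lsn=1000, flush_lsn=500)

    # Two candidate values for the rewound standby's minRecoveryPoint.
    mrp_from_insert = primary.insert_lsn  # 1000
    mrp_from_flush = primary.flush_lsn    # 500

    primary.crash_and_restart()

    return (
        can_reach_consistency(mrp_from_insert, primary),
        can_reach_consistency(mrp_from_flush, primary),
    )


if __name__ == "__main__":
    print(demo())
```

In this model the standby whose minRecoveryPoint was taken from the
insert LSN has to wait until the restarted primary generates enough new
WAL to pass LSN 1000, while the one taken from the flush LSN is
satisfied immediately.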

> I had thought about this before. To update minRecoveryPoint in place, I think we
> should make sure that it won't cause any side effects. That means we need to
> check every place where we use minRecoveryPoint. That's why the v1 patch introduces
> GetEffectiveMinRecoveryPoint() rather than updating it in place.

The reason I suggested that approach in my earlier email was simply
that, as far as I could tell, that was the only place that needed to
interpret a minRecoveryPoint value. That said, since minRecoveryPoint
is expected to point to the end of the last required record, I think
normalizing the value earlier would also be reasonable.

Anyway, if we decide to use the flush LSN instead, then none of this
should be necessary.

Regards,

--
Kyotaro Horiguchi
NTT Open Source Software Center
