| From: | Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com> |
|---|---|
| To: | cca5507(at)qq(dot)com |
| Cc: | suryapoondla4(at)gmail(dot)com, pgsql-hackers(at)lists(dot)postgresql(dot)org |
| Subject: | Re: [BUG] Take a long time to reach consistent after pg_rewind |
| Date: | 2026-06-12 08:33:30 |
| Message-ID: | 20260612.173330.2199357374649405029.horikyota.ntt@gmail.com |
| Lists: | pgsql-hackers |
Hello.
At Thu, 11 Jun 2026 15:50:19 +0800, "cca5507" <cca5507(at)qq(dot)com> wrote in
> > I think pg_rewind is probably using the insert LSN because it wants to
> > choose a conservative position as far ahead as possible. It might be
> > possible to use the flush LSN if the copying logic is carefully
> > arranged, but I would prefer to keep using the insert LSN if we can.
>
> The insert LSN is not crash-safe, so does it really make sense to use it? For
> example, the primary has insert LSN 1000 and flush LSN 500, the standby
> sets minRecoveryPoint to 1000, and then the primary crashes and restarts.
> The primary now only has WAL up to LSN 500, but the standby cannot reach
> consistency until LSN 1000. This doesn’t make sense to me.
My understanding had been that the state produced by pg_rewind only
needed to be valid with respect to the source server at the time
pg_rewind was run.
If the expectation is that the rewound standby must also remain usable
after the source server subsequently goes through crash recovery, then
I agree that using the insert LSN becomes harder to justify.
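To make the scenario concrete, here is a minimal sketch of it as a toy model. It is not PostgreSQL code: SourceWal, source_after_crash() and standby_can_reach_consistency_now() are hypothetical names invented for illustration, XLogRecPtr is only a stand-in typedef, and the model assumes the standby can replay only WAL that the source has flushed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Stand-in for PostgreSQL's XLogRecPtr: a plain 64-bit WAL position. */
typedef uint64_t XLogRecPtr;

/* Hypothetical model of the source server's WAL positions. */
typedef struct SourceWal
{
    XLogRecPtr insert_lsn;  /* end of WAL inserted into WAL buffers */
    XLogRecPtr flush_lsn;   /* end of WAL durably written to disk */
} SourceWal;

/*
 * Model a crash and restart of the source: WAL beyond the flush position
 * is lost, so both positions restart from the old flush LSN.
 */
static SourceWal
source_after_crash(SourceWal before)
{
    SourceWal after;

    /* The flush position never runs ahead of the insert position. */
    assert(before.flush_lsn <= before.insert_lsn);

    after.insert_lsn = before.flush_lsn;
    after.flush_lsn = before.flush_lsn;
    return after;
}

/*
 * Assumption of this model: the standby becomes consistent once it has
 * replayed up to min_recovery_point, and only flushed WAL is available
 * to it. Returns true if the source already has enough WAL on disk.
 */
static bool
standby_can_reach_consistency_now(XLogRecPtr min_recovery_point,
                                  SourceWal source)
{
    return source.flush_lsn >= min_recovery_point;
}
```

With insert LSN 1000 and flush LSN 500, a minRecoveryPoint of 1000 is not reachable right away in this model, either before or after the crash, while a minRecoveryPoint of 500 is. After the crash the source has to generate new WAL up to LSN 1000 before the standby can become consistent.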
> I had thought about this before. To update minRecoveryPoint in place, I think we
> should make sure that it won't cause any side effects. That means we need to
> check every place where we use minRecoveryPoint. That's why the v1 patch introduces
> GetEffectiveMinRecoveryPoint() rather than updating it in place.
The reason I suggested that approach in my earlier email was simply
that, as far as I could tell, that was the only place that needed to
interpret a minRecoveryPoint value. That said, since minRecoveryPoint
is expected to point to the end of the last required record, I think
normalizing the value earlier would also be reasonable.
Anyway, if we decide to use the flush LSN instead, then none of this
should be necessary.
Regards,
--
Kyotaro Horiguchi
NTT Open Source Software Center