From: | Ants Aasma <ants(at)cybertec(dot)at> |
---|---|
To: | Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com> |
Cc: | pgsql-hackers(at)postgresql(dot)org |
Subject: | Re: Standby recovers records from wrong timeline |
Date: | 2022-10-20 11:44:40 |
Message-ID: | CANwKhkPozUvyfuy1sz0fKN4=CC3TPQOF0Tr+uEVO_XX6yqDHpA@mail.gmail.com |
Lists: | pgsql-hackers |
On Thu, 20 Oct 2022 at 11:30, Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com> wrote:
>
> primary_restored did a time-travel to past a bit because of the
> recovery_target=immediate. In other words, the primary_restored and
> the replica diverge. I don't think it is legit to connect a diverged
> standby to a primary.
primary_restored did travel back in time, but since we're doing PITR on the
primary, that's the expected behavior. The replica, however, has not diverged:
it's a copy of the exact same base backup. The use case is restoring a
cluster from backup using PITR and using the same backup to create a
standby. Currently this breaks when the primary has not yet archived any
segments.
> So, about the behavior in doubt, it is the correct behavior to
> seemingly ignore the history file in the archive. Recovery assumes
> that the first half of the first segment of the new timeline is the
> same with the same segment of the old timeline (.partial) so it is
> legit to read the <tli=1,seg=2> file til the end and that causes the
> replica goes beyond the divergence point.
What is happening is that primary_restored has a timeline switch at
tli 2, lsn 0/2000100, and the next insert record starts in the same
segment. The replica starts from the same backup on timeline 1 and tries to
find tli 2 seg 2, which is not archived yet, so it falls back to tli 1 seg 2.
It replays tli 1 seg 2, continues to tli 1 seg 3, then connects to the primary
and starts applying WAL from tli 2 seg 4. To me that seems
completely broken.
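To make the segment arithmetic concrete, here is a small Python sketch. It is a
simplified model of the scenario, not the actual recovery code: the function
names are made up for illustration, and it assumes the default 16MB WAL segment
size. Given the history of tli 2 (branched off tli 1 at 0/2000100), it lists
which timelines can legitimately supply a given segment.

```python
# Simplified model of the scenario; NOT PostgreSQL's actual recovery code.
# All function names here are hypothetical.
WAL_SEGMENT_SIZE = 16 * 1024 * 1024  # assumed default segment size (16MB)


def parse_lsn(text):
    """Convert an LSN such as '0/2000100' to a 64-bit integer."""
    hi, lo = text.split("/")
    return (int(hi, 16) << 32) | int(lo, 16)


def segment_of(lsn):
    """Number of the WAL segment that contains the given LSN."""
    return lsn // WAL_SEGMENT_SIZE


def wal_file_name(tli, segno):
    """24-hex-digit WAL file name: timeline, then high and low segment parts."""
    segs_per_id = 0x100000000 // WAL_SEGMENT_SIZE
    return "%08X%08X%08X" % (tli, segno // segs_per_id, segno % segs_per_id)


def valid_timelines(history, target_tli, segno):
    """Timelines whose range overlaps segment `segno`, newest first.

    `history` lists (tli, switch_lsn) for each ancestor of target_tli,
    oldest first, like the lines of a timeline history file.
    """
    seg_start = segno * WAL_SEGMENT_SIZE
    seg_end = seg_start + WAL_SEGMENT_SIZE
    result = []
    begin = 0
    for tli, switch_lsn in history + [(target_tli, None)]:
        # A timeline covers [begin, switch_lsn); the target has no end yet.
        if seg_end > begin and (switch_lsn is None or seg_start < switch_lsn):
            result.append(tli)
        begin = switch_lsn
    return list(reversed(result))


# tli 2 branched off tli 1 at 0/2000100, as in the scenario above.
history = [(1, parse_lsn("0/2000100"))]

for segno in (2, 3, 4):
    names = [wal_file_name(t, segno) for t in valid_timelines(history, 2, segno)]
    print("seg", segno, "->", names)
```

In this model, seg 2 can come from either tli 2 or tli 1, because the switch
point lies inside it, but seg 3 and later can only come from tli 2. Replaying
tli 1 seg 3 is exactly the step that takes the replica past the switch point.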
> As you know, when new primary starts a diverged history, the
> recommended way is to blow (or stash) away the archive, then take a
> new backup from the running primary.
My understanding is that backup archives are supposed to remain valid
even after PITR or, equivalently, after a lagging standby is promoted.
--
Ants Aasma
Senior Database Engineer
www.cybertec-postgresql.com