Re: pg_rewind WAL segments deletion pitfall

From: Alexander Kukushkin <cyberdemn(at)gmail(dot)com>
To: Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>
Cc: bungina(at)gmail(dot)com, pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: pg_rewind WAL segments deletion pitfall
Date: 2022-08-30 09:01:58
Message-ID: CAFh8B=mT617KYUyJ-j6gKhVmNRYamT_AcFPF1CorB7FcvbxdAw@mail.gmail.com
Lists: pgsql-bugs pgsql-hackers

On Tue, 30 Aug 2022 at 10:27, Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>
wrote:

>
> Hmm. Doesn't it work to ignore the TLI then? All segments whose
> segment number is equal to or larger than the checkpoint location are
> preserved regardless of TLI?
>

If we ignore TLI there is a chance that we may retain some unnecessary (or
just wrong) files.
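
To illustrate, here is a minimal sketch of a name-based check that ignores
the TLI. It assumes the usual 24-character WAL file name (8 hex digits each
for TLI, log and segment) and takes the WAL segment size as a parameter.
parse_wal_file_name() and keep_ignoring_tli() are hypothetical helpers
written for this mail, not pg_rewind code.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: split a WAL segment file name into timeline ID and
 * segment number. Returns false for anything that is not exactly 24
 * upper-case hex digits (history files, .partial files, and so on).
 */
bool
parse_wal_file_name(const char *fname, uint64_t wal_segsz_bytes,
                    uint32_t *tli, uint64_t *segno)
{
    unsigned int t, log, seg;

    if (wal_segsz_bytes == 0)
        return false;
    if (strlen(fname) != 24 || strspn(fname, "0123456789ABCDEF") != 24)
        return false;
    if (sscanf(fname, "%8X%8X%8X", &t, &log, &seg) != 3)
        return false;

    *tli = t;
    /* Each "log" value covers 4 GB worth of segments. */
    *segno = (uint64_t) log * (UINT64_C(0x100000000) / wal_segsz_bytes) + seg;
    return true;
}

/*
 * Hypothetical check following the "ignore TLI" idea: keep every segment
 * at or after the checkpoint segment, whatever timeline it is on.
 */
bool
keep_ignoring_tli(const char *fname, uint64_t wal_segsz_bytes,
                  uint64_t checkpoint_segno)
{
    uint32_t tli;
    uint64_t segno;

    if (!parse_wal_file_name(fname, wal_segsz_bytes, &tli, &segno))
        return false;
    return segno >= checkpoint_segno;   /* tli is never consulted */
}
```

With the checkpoint in segment 3, both 000000010000000000000005 and
000000020000000000000005 pass this check, even if only one of them belongs
to the history we actually need.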

>
> > Also, we need to take into account the divergence LSN. Files after it are
> > not required.
>
> They are removed at the later checkpoints. But also we can remove
> segments that are out of the range between the last common checkpoint
> and divergence point ignoring TLI.

Everything that is newer than last_common_checkpoint_seg could be removed (but
this already happens automatically, because these files are missing on the
new primary).
WAL files that are older than last_common_checkpoint_seg could be either
removed or at least not copied from the new primary.

> the divergence point is also
> compared?
>
> > if (file_segno >= last_common_checkpoint_seg &&
> > file_segno <= divergence_seg)
> > <PRESERVE IT>;
>

The current implementation relies on tracking which WAL files are opened while
searching for the last common checkpoint. It automatically starts from the
divergence_seg, automatically finishes at last_common_checkpoint_seg, and
last but not least, automatically handles timeline changes. I don't think
that manually written code that decides what to do from the WAL file name
(and also takes into account TLI) could be much simpler than the current
approach.
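
The idea of tracking opened files can be sketched roughly as follows. This is
only an illustration under my own assumptions, not the code from the patch:
the TrackedSegments type, the fixed-size array and the function names are all
made up here. The WAL reader would call tracked_remember() each time it opens
a segment, and the file-copy step would later consult tracked_contains().

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_TRACKED_SEGMENTS 1024   /* arbitrary limit for this sketch */

/* One WAL segment that was opened during the backwards search. */
typedef struct
{
    uint32_t    tli;
    uint64_t    segno;
} TrackedSegment;

typedef struct
{
    TrackedSegment items[MAX_TRACKED_SEGMENTS];
    size_t      count;
} TrackedSegments;

/* Was this (timeline, segment) pair opened during the search? */
bool
tracked_contains(const TrackedSegments *set, uint32_t tli, uint64_t segno)
{
    for (size_t i = 0; i < set->count; i++)
    {
        if (set->items[i].tli == tli && set->items[i].segno == segno)
            return true;
    }
    return false;
}

/*
 * Record a segment when the reader opens it. Duplicates are ignored.
 * Returns false only if the fixed-size array is full.
 */
bool
tracked_remember(TrackedSegments *set, uint32_t tli, uint64_t segno)
{
    if (tracked_contains(set, tli, segno))
        return true;
    if (set->count >= MAX_TRACKED_SEGMENTS)
        return false;

    set->items[set->count].tli = tli;
    set->items[set->count].segno = segno;
    set->count++;
    return true;
}
```

Because the pair (tli, segno) is recorded at open time, a timeline switch in
the middle of the range needs no special handling: whatever the reader
opened is what gets preserved.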

Actually, since we are starting to do some additional "manipulations" with
files in pg_wal, we should probably do a symmetric action with the files
inside pg_wal/archive_status.
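
As a rough sketch of what that would touch: each WAL segment can have a
matching <segment>.ready or <segment>.done file in pg_wal/archive_status.
The helper below is hypothetical and only builds such a path relative to the
data directory. What exactly the "symmetric action" should be (keep, remove
or skip the status file) is still to be decided.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical helper: build the archive status path for a WAL segment.
 * "suffix" is expected to be ".ready" or ".done". Returns false if the
 * result does not fit into the buffer.
 */
bool
archive_status_path(char *buf, size_t buflen,
                    const char *wal_name, const char *suffix)
{
    int         n;

    n = snprintf(buf, buflen, "pg_wal/archive_status/%s%s", wal_name, suffix);
    return n >= 0 && (size_t) n < buflen;
}
```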

Regards,
--
Alexander Kukushkin
