Re: Force the old transactions logs cleanup even if checkpoint is skipped

From: "Zakhlystov, Daniil (Nebius)" <usernamedt(at)nebius(dot)com>
To: Michael Paquier <michael(at)paquier(dot)xyz>
Cc: "amborodin(at)acm(dot)org" <amborodin(at)acm(dot)org>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, "Mokrushin, Mikhail (Nebius)" <rodrijjke(at)nebius(dot)com>
Subject: Re: Force the old transactions logs cleanup even if checkpoint is skipped
Date: 2023-11-09 11:50:10
Message-ID: D80931B7-CE13-41D9-B9B3-DE30A9001EEC@nebius.com
Lists: pgsql-hackers

Hello,

> On 9 Nov 2023, at 01:30, Michael Paquier <michael(at)paquier(dot)xyz> wrote:
>
> I am not really convinced that this is worth complicating the skipped
> path for this goal. In my experience, I've seen complaints where WAL
> archiving bloat was coming from the archive command not able to keep
> up with the amount generated by the backend, particularly because the
> command invocation was taking longer than it takes to generate a new
> segment. Even if there is a hole of activity in the server, if too
> much WAL has been generated it may not be enough to catch up depending
> on the number of segments that need to be processed. Others are free
> to chime in with extra opinions, of course.

I agree that there might be multiple reasons for pg_wal bloat. Please note that
I am not addressing the WAL archiving issue at all. My proposal is a small
improvement to the WAL cleanup routine so that segments that have already been
archived successfully are removed and their disk space is freed.

Yes, it might not be a common case, but it is a fairly realistic one. It has
occurred multiple times in our production when we had temporary issues with
archiving. This small complication of the skipped path will help Postgres return
to a normal operational state without intervention from a human operator or an
external control routine.

> On 9 Nov 2023, at 01:30, Michael Paquier <michael(at)paquier(dot)xyz> wrote:
>
> While on it, I think that your patch would cause incorrect and early
> removal of segments. It computes the name of the last segment to
> remove based on last_important_lsn, ignoring KeepLogSeg(), meaning
> that it ignores any WAL retention required by replication slots or
> wal_keep_size. And this causes the calculation of an incorrect segno
> horizon.

Please check the latest patch version. I believe this has already been fixed there.
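
For illustration, the retention concern raised above can be modeled roughly as
follows. This is a simplified standalone sketch, not the code from the patch and
not PostgreSQL's KeepLogSeg(). The function name removal_horizon_segno, its
parameters, and the fixed 16MB segment size are assumptions made for the example.

```c
#include <stdint.h>

typedef uint64_t XLogRecPtr;   /* byte position in the WAL stream */
typedef uint64_t XLogSegNo;    /* WAL segment number */

#define WAL_SEGMENT_SIZE (16ULL * 1024 * 1024)  /* assumed segment size */
#define INVALID_LSN 0                           /* "no slot" marker in this sketch */

/*
 * Hypothetical helper: return the oldest segment number that must be kept.
 * In this model, segments numbered strictly below the result are removable.
 */
static XLogSegNo
removal_horizon_segno(XLogRecPtr candidate_lsn,  /* e.g. last important LSN */
                      XLogRecPtr slot_min_lsn,   /* oldest LSN needed by any slot */
                      uint64_t keep_size_bytes)  /* wal_keep_size-style retention */
{
    XLogSegNo segno = candidate_lsn / WAL_SEGMENT_SIZE;
    XLogSegNo keep_segs = keep_size_bytes / WAL_SEGMENT_SIZE;

    /* Hold back the horizon by the retained size, without underflowing. */
    if (keep_segs > 0)
    {
        if (segno <= keep_segs)
            segno = 1;
        else
            segno -= keep_segs;
    }

    /* Never advance past what a replication slot still needs. */
    if (slot_min_lsn != INVALID_LSN)
    {
        XLogSegNo slot_segno = slot_min_lsn / WAL_SEGMENT_SIZE;

        if (slot_segno < segno)
            segno = slot_segno;
    }

    return segno;
}
```

In this model, a candidate LSN in segment 10 with three segments of retained
size and a slot still needing segment 4 gives a horizon of segment 4. Using the
candidate LSN alone would give 10 and remove segments the slot still needs.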

Thanks,

Daniil Zakhlystov
