Re: WAL restore/recovery fills pg_wal volume

From: David Steele <david(at)pgmasters(dot)net>
To: Don Seiler <don(at)seiler(dot)us>, pgsql-admin <pgsql-admin(at)postgresql(dot)org>
Subject: Re: WAL restore/recovery fills pg_wal volume
Date: 2022-08-02 18:16:31
Message-ID: c8e99868-1151-85c8-0e3a-35ccd3e8b615@pgmasters.net
Lists: pgsql-admin

On 8/2/22 09:26, Don Seiler wrote:
>
> I'm on PG 12 on Ubuntu 18.04.
>
> I have a very large database (~18TB) for which we have configured a
> separate pg_wal volume. The pg_wal volume is around 50GB, but typical
> daily usage rarely exceeds 10GB. We use pgbackrest to archive WAL
> as well as to perform backups.
>
> The problem is when we are performing restore and recovery (e.g. to set
> up a new physical replica), also via pgbackrest. The DB restore works
> fine but when it comes to the WAL restore and recovery, the pg_wal
> volume will fill up before PG can clear out the already-recovered WAL
> files. This means I have to restart the database and start the recovery
> process over again. Last time I ended up writing a cron job that
> deleted around 100 WAL files per minute with a plain `rm` command, a
> rate I chose based on the recovery rate I saw.
>
> My understanding is that the recovered WAL files are cleared when
> the replica reaches a restartpoint. My primary is configured with a
> 10 minute checkpoint_timeout, but the replica's pg_wal fills up in less
> than 10 minutes. The primary and replica have the same size pg_wal
> volume, but the primary never comes close to filling up, as I said before.
>
> I believe two options (aside from the ugly rm cron job) would be to
> either shorten the checkpoint_timeout on the primary, which would be
> hard to do due to the activity level, or make a larger pg_wal volume
> (trial and error to determine just how much larger?).
>
> I'm interested to know if there's anything else I can do to avoid the
> toil when we do these restores, and also whether something is wrong
> and PG shouldn't be filling up the volume blindly.

This appears to be related to the issue in [1], which we have been
discussing on that thread.
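
On the "trial and error" sizing question in the quoted message, the
arithmetic can at least be roughed out. The following is only a sketch in
Python, not something taken from this thread: it assumes restored WAL is
removed only at restartpoints, that segments are the default 16MB, and
that you have measured the restore rate yourself. The function names and
the 400 segments/minute figure are hypothetical. The interval between
restartpoints on the replica is an input here and need not match the
primary's checkpoint_timeout.

```python
# Rough pg_wal sizing arithmetic for archive recovery (illustrative only).
# Assumption: restored WAL is only removed at restartpoints, so segments
# accumulate in pg_wal between two consecutive restartpoints.

MIB = 1024 * 1024
GIB = 1024 * MIB
DEFAULT_SEGMENT_BYTES = 16 * MIB  # PostgreSQL's default WAL segment size


def minutes_until_full(volume_bytes, segments_per_minute,
                       segment_bytes=DEFAULT_SEGMENT_BYTES):
    """Minutes before an empty volume fills if nothing is removed."""
    return volume_bytes / (segments_per_minute * segment_bytes)


def required_volume_bytes(segments_per_minute, minutes_between_restartpoints,
                          segment_bytes=DEFAULT_SEGMENT_BYTES):
    """Space needed to hold all WAL restored between two restartpoints."""
    return segments_per_minute * minutes_between_restartpoints * segment_bytes


if __name__ == "__main__":
    rate = 400  # hypothetical restore rate, segments per minute
    print(minutes_until_full(50 * GIB, rate))
    print(required_volume_bytes(rate, 10) / GIB)
```

With those hypothetical inputs, a 50GB volume would fill in 8 minutes,
and holding 10 minutes of restored WAL would need about 62.5GB. If the
measured numbers say the volume should be large enough and it still
fills, that points at WAL not being removed when expected rather than at
sizing.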

Regards,
-David

[1]
https://www.postgresql.org/message-id/flat/20210202151416.GB3304930%40rfd.leadboat.com
