Quick Links

pg_restore causing ENOSPACE on the WAL partition

From:	Dimitrios Apostolou <jimis(at)gmx(dot)net>
To:	pgsql-general(at)lists(dot)postgresql(dot)org
Subject:	pg_restore causing ENOSPACE on the WAL partition
Date:	2025-04-10 11:19:17
Message-ID:	076464ad-3d70-dd25-9e8f-e84f27decfba@gmx.net
Views:	Whole Thread \| Raw Message \| Download mbox \| Resend email
Thread:
Lists:	pgsql-general

Hello list,

last night I got ENOSPACE on the WAL partition while running a huge
pg_restore on an empty and idle database.
The checkpoint that started 7 minutes earlier never finished:

04:31:09 LOG: checkpoint starting: wal
...
04:38:04 PANIC: could not write to file "pg_wal/xlogtemp.546204": No space left on device
:CONTEXT: COPY table_partition_123, line 120872534

This is most likely because most of the data is written to a very slow
"archival" tablespace. No I/O errors on the system logs, I know things go
slow on that device (iSCSI drive over a shared spinning-disks pool and
busy network).

This happened even though I'm keeping the WAL in its own partition (128GB
partition size), with a lot of free space above max_wal_size (64GB).
Somehow it managed to grow above max_wal_size and fill 100% of the
partition.

I'm running latest PostgreSQL 17 and the settings have been temporarily
tweaked for fastest pg_restore:

max_wal_size=64GB
max_replication_slots = 0
max_logical_replication_workers = 0
max_wal_senders = 0
wal_level = minimal
autovacuum = off

Several things seem to have gone wrong here. Questions:

+ The WAL partition is much faster than the archival tablespace. Am I in
constant danger of overruning max_wal_size? How to make 100% sure this
never happens again?

+ After recovery, with the database idling, I notice that WAL space usage
is constant at 64GB. Why doesn't it free up space down to min_wal_size (1GB)?

+ I just created a 13GB zstd-compressed tarball of those 64GB WAL
files. This indicates that the files are compressible despite using
wal_compression=zstd setting. Could it be that postgres ignores the flag
and does not compress the WAL? How to check?

+ I'm using parallel pg_restore --data-only, can't avoid that for now.
Even though all the tables are empty (I truncated everything before
starting pg_restore), I can't find a way to avoid going through the WAL.
Ideas?

Thanks in advance,
Dimitris

Responses

Re: pg_restore causing ENOSPACE on the WAL partition at 2025-04-10 12:11:57 from Dimitrios Apostolou
Re: pg_restore causing ENOSPACE on the WAL partition at 2025-04-11 11:38:26 from Dimitrios Apostolou

Browse pgsql-general by date

	From	Date	Subject
Next Message	Francisco Olarte	2025-04-10 11:30:27	Re: [EXTERNAL] Re: Cannot connect local with ttrust (pg_upgrade assumes it)
Previous Message	Tomas Vondra	2025-04-10 10:07:45	Re: Meson and Numa: C header not found