Pre-allocating WAL files

From: Andres Freund <andres(at)anarazel(dot)de>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Pre-allocating WAL files
Date: 2020-12-25 20:09:53
Message-ID: 20201225200953.jjkrytlrzojbndh5@alap3.anarazel.de
Lists: pgsql-hackers

Hi,

When running write-heavy transactional workloads I've often observed
that benchmarks need to run for quite a while before they reach their
steady-state performance. The most significant reason for that is that
initially WAL files do not get recycled, but need to be freshly
initialized. That's 16MB of writes (with the default segment size) that
need to finish synchronously before a small write transaction can even
start to be written out...
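
To make that cost concrete, here is a rough sketch of what "freshly
initializing" a segment amounts to: zero-fill the whole file, then fsync
it before anyone may use it. This is not PostgreSQL's actual code; the
function name, the SKETCH_* constants and the 8kB chunk size are
assumptions made for illustration, and EINTR handling is left out.

```c
#include <fcntl.h>
#include <stddef.h>
#include <string.h>
#include <unistd.h>

/* Assumed values for illustration: default 16MB segment, 8kB chunks. */
#define SKETCH_SEGMENT_SIZE (16 * 1024 * 1024)
#define SKETCH_BLOCK_SIZE   8192

/*
 * Hypothetical helper: create `path`, fill it with zeros up to the
 * segment size and fsync it. Returns 0 on success, -1 on failure.
 * The caller is blocked for the full write + fsync.
 */
static int
init_segment_sketch(const char *path)
{
    char    zeros[SKETCH_BLOCK_SIZE];
    size_t  written = 0;
    int     fd;

    memset(zeros, 0, sizeof(zeros));

    /* O_EXCL: fail rather than clobber an existing segment. */
    fd = open(path, O_CREAT | O_EXCL | O_WRONLY, 0600);
    if (fd < 0)
        return -1;

    while (written < SKETCH_SEGMENT_SIZE)
    {
        size_t  remaining = SKETCH_SEGMENT_SIZE - written;
        size_t  chunk = remaining < sizeof(zeros) ? remaining : sizeof(zeros);
        ssize_t rc = write(fd, zeros, chunk);

        if (rc <= 0)
        {
            close(fd);
            unlink(path);
            return -1;
        }
        written += (size_t) rc;
    }

    /* The synchronous part: nothing can use the segment before this returns. */
    if (fsync(fd) != 0)
    {
        close(fd);
        unlink(path);
        return -1;
    }

    return close(fd);
}
```

Pre-allocation would move exactly this work out of the path of the
backend that happens to cross into a not-yet-existing segment.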

I think there are two useful things we could do:

1) Add pg_wal_preallocate(uint64 bytes) that ensures (bytes +
segment_size - 1) / segment_size WAL segments exist from the current
point in the WAL. Perhaps with the number of bytes defaulting to
min_wal_size if not explicitly specified?

2) Have checkpointer (we want walwriter to run with low latency to flush
out async commits etc) occasionally check if WAL files need to be
pre-allocated.

Checkpointer already tracks the amount of WAL that's expected to be
generated till the end of the checkpoint, so it seems like it's a
pretty good candidate to do so.

To keep checkpointer pre-allocating when idle we could signal it
whenever a record has crossed a segment boundary.
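
The segment count in (1) is a plain round-up division. A minimal sketch
of that arithmetic follows; the helper name is hypothetical, and overflow
of the addition for byte counts near UINT64_MAX is deliberately ignored.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: number of WAL segments needed to cover `bytes`,
 * using the formula from (1): (bytes + segment_size - 1) / segment_size.
 * Assumes bytes + segment_size - 1 does not overflow uint64_t.
 */
static uint64_t
wal_segments_for_bytes(uint64_t bytes, uint64_t segment_size)
{
    assert(segment_size > 0);
    return (bytes + segment_size - 1) / segment_size;
}
```

With 16MB segments, a request for 16MB yields one segment, while a
request for 16MB plus one byte yields two.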

With a plain pgbench run I see a 2.5x reduction in throughput in the
periods where we initialize WAL files.

Greetings,

Andres Freund
