Re: patch to allow disable of WAL recycling

From: Michael Paquier <michael(at)paquier(dot)xyz>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Andres Freund <andres(at)anarazel(dot)de>, Jerry Jelinek <jerry(dot)jelinek(at)joyent(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: patch to allow disable of WAL recycling
Date: 2018-07-17 00:29:34
Message-ID: 20180717002934.GC3388@paquier.xyz
Lists: pgsql-hackers

On Mon, Jul 16, 2018 at 10:38:14AM -0400, Robert Haas wrote:
> It's been a few years since I tested this, but my recollection is that
> if you fill up pg_xlog, the system will PANIC and die on a vanilla
> Linux install. Sure, you can set max_wal_size, but that's a soft
> limit, not a hard limit, and if you generate WAL faster than the
> system can checkpoint, you can overrun that value and force allocation
> of additional WAL files. So I'm not sure we have any working
> ENOSPC-panic protection today. Given that, I'm doubtful that we
> should prioritize maintaining whatever partially-working protection we
> may have today over raw performance. If we want to fix ENOSPC on
> pg_wal = PANIC, and I think that would be a good thing to fix, then we
> should do it either by finding a way to make the WAL insertion ERROR
> out instead of panicking, or throttle WAL generation as we get close
> to disk space exhaustion so that the checkpoint has time to complete,
> as previously proposed by Heroku.

I would personally prefer to see max_wal_size turned into a hard limit,
with that behavior made tunable. I am wondering if that's the case for
other people on this list, but every couple of weeks I see people
complaining that Postgres is not able to maintain a hard guarantee
behind the value of max_wal_size. In some upgrade scenarios, I had to
tell such folks to throttle their insert load and also to issue
checkpoints manually so that the system stays up and the upgrade
process can continue. So there are definitely cases where throttling is
useful, and if the hard limit is reached I would rather see WAL
generation from other backends simply stopped so that the system can
finish its checkpoint, instead of risking the system going down. This
sometimes happens with a SQL dump as well, where throttling the load at
the application level means a more complex dump strategy, for example
splitting things between multiple files.
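
To make the hard-limit idea a bit more concrete, here is a toy model of
what I have in mind, written in Python only for brevity. This is not
PostgreSQL code: WalBudget, try_reserve, checkpoint and
insert_with_throttle are hypothetical names, and a real checkpoint
reclaims space asynchronously rather than through a direct call. The
only point is that a writer which would exceed the limit stalls until
a checkpoint frees WAL, instead of overrunning the limit.

```python
from dataclasses import dataclass


@dataclass
class WalBudget:
    """Toy model of a hard WAL size limit (hypothetical, not PostgreSQL code)."""
    hard_limit_bytes: int
    used_bytes: int = 0

    def try_reserve(self, nbytes: int) -> bool:
        """Reserve space for one WAL record; refuse if it would exceed the limit."""
        if self.used_bytes + nbytes > self.hard_limit_bytes:
            return False  # caller has to wait for a checkpoint
        self.used_bytes += nbytes
        return True

    def checkpoint(self, reclaimed_bytes: int) -> None:
        """Model a checkpoint that allows old WAL to be removed or recycled."""
        self.used_bytes = max(0, self.used_bytes - reclaimed_bytes)


def insert_with_throttle(budget, record_sizes, reclaim_per_checkpoint):
    """Write every record, stalling for a checkpoint whenever the limit is hit.

    Returns the number of stalls, i.e. how often a writer had to wait.
    """
    if reclaim_per_checkpoint <= 0:
        raise ValueError("a checkpoint must reclaim some space")
    stalls = 0
    for size in record_sizes:
        if size > budget.hard_limit_bytes:
            raise ValueError("record can never fit under the hard limit")
        while not budget.try_reserve(size):
            # Stand-in for "stop WAL generation until the checkpoint is done".
            budget.checkpoint(reclaim_per_checkpoint)
            stalls += 1
    return stalls


budget = WalBudget(hard_limit_bytes=100)
stalls = insert_with_throttle(budget, [40, 40, 40, 40], reclaim_per_checkpoint=80)
```

In this model the writer pays with latency (the stalls) while used_bytes
never goes above hard_limit_bytes, which is the trade-off I would like
to be tunable.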
--
Michael
