From: | Simon Riggs <simon(at)2ndQuadrant(dot)com> |
---|---|
To: | Jeff Janes <jeff(dot)janes(at)gmail(dot)com> |
Cc: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Peter Geoghegan <pg(at)heroku(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com> |
Subject: | Re: Hard limit on WAL space used (because PANIC sucks) |
Date: | 2014-01-22 00:18:36 |
Message-ID: | CA+U5nMKEoPyRHv0=yuWT5_ghxNqNKEhW5vODiyqQ9=_tmuzNiA@mail.gmail.com |
Lists: | pgsql-hackers |
On 21 January 2014 23:01, Jeff Janes <jeff(dot)janes(at)gmail(dot)com> wrote:
> On Tue, Jan 21, 2014 at 9:35 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>>
>> Simon Riggs <simon(at)2ndQuadrant(dot)com> writes:
>> > On 6 June 2013 16:00, Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>
>> > wrote:
>> >> The current situation is that if you run out of disk space while
>> >> writing
>> >> WAL, you get a PANIC, and the server shuts down. That's awful.
>>
>> > I don't see we need to prevent WAL insertions when the disk fills. We
>> > still have the whole of wal_buffers to use up. When that is full, we
>> > will prevent further WAL insertions because we will be holding the
>> > WALwritelock to clear more space. So the rest of the system will lock
>> > up nicely, like we want, apart from read-only transactions.
>>
>> I'm not sure that "all writing transactions lock up hard" is really so
>> much better than the current behavior.
>>
>> My preference would be that we simply start failing writes with ERRORs
>> rather than PANICs. I'm not real sure ATM why this has to be a PANIC
>> condition. Probably the cause is that it's being done inside a critical
>> section, but could we move that?
>
>
> My understanding is that if it runs out of buffer space while in an
> XLogInsert, it will be holding one or more buffer content locks exclusively,
> and unless it can complete the xlog (or scrounge up the info to return that
> buffer to its previous state), it can never release that lock. There might
> be other paths where it could get by with an ERROR, but if no one can write
> xlog anymore, all of those paths must quickly converge to the one that
> cannot simply ERROR.
Agreed. You don't say it, but I presume you intend to point out that
such long-lived contention could easily have a knock-on effect on
other read-only statements. I'm pretty sure other databases work the
same way.
Our choices are:
1. Waiting
2. Aborting transactions
3. Some kind of release-locks-then-wait-and-retry
(3) is a step too far for me, even though it is easier than you say:
since we write WAL before changing the data block, a failure to
insert WAL could just result in temporarily dropping the lock,
sleeping and retrying.
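
To make (3) concrete, here is a toy sketch of the drop-lock, sleep and retry loop. It is not PostgreSQL code: `WalFull`, `try_insert_wal`, `apply_change` and `modify_block_with_retry` are hypothetical names made up for illustration, and a plain Python lock stands in for a buffer content lock. The only property it relies on is the one stated above, that the WAL insert is attempted before the data block is touched.

```python
import threading
import time


class WalFull(Exception):
    """Raised by the hypothetical try_insert_wal when no WAL space is left."""


def modify_block_with_retry(block_lock, try_insert_wal, apply_change,
                            sleep_s=0.01, max_attempts=None, sleep=time.sleep):
    """Take the lock, insert WAL first, and only then change the block.

    If the WAL insert fails, the block is still unmodified, so the lock
    can simply be released before sleeping and trying again.
    Returns the number of attempts that were needed.
    """
    attempts = 0
    while True:
        attempts += 1
        with block_lock:
            try:
                try_insert_wal()      # WAL is written before the data block
            except WalFull:
                pass                  # nothing to undo; lock drops on exit
            else:
                apply_change()        # safe: the WAL record already exists
                return attempts
        if max_attempts is not None and attempts >= max_attempts:
            raise WalFull("gave up after %d attempts" % attempts)
        sleep(sleep_s)                # lock is NOT held while sleeping
```

The point of the sketch is only that the sleep happens with the lock released, so readers of the block are not stuck behind the writer while it waits for space.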
I would go for (1), waiting for up to checkpoint_timeout, then (2) if
we think that is a problem.
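
As a rough illustration of that policy, wait first and abort only once the wait has run too long, something like the following. Again this is a sketch under assumptions, not server code: `space_available`, `OutOfWalSpace` and `wait_then_abort` are hypothetical names, and `timeout_s` merely stands in for checkpoint_timeout.

```python
import time


class OutOfWalSpace(Exception):
    """Hypothetical error standing in for aborting the transaction."""


def wait_then_abort(space_available, timeout_s, poll_s=1.0,
                    clock=time.monotonic, sleep=time.sleep):
    """Option (1) followed by option (2).

    Poll space_available() until it returns True. If the deadline
    passes first, raise OutOfWalSpace instead of waiting forever.
    Returns the number of seconds spent waiting.
    """
    start = clock()
    deadline = start + timeout_s
    while not space_available():
        if clock() >= deadline:
            raise OutOfWalSpace("no WAL space after %.0f s" % timeout_s)
        sleep(poll_s)
    return clock() - start
```

A caller would pass something like timeout_s=300, since the default checkpoint_timeout is 5 minutes, and treat OutOfWalSpace as an ordinary ERROR for that transaction.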
--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services