Re: Hard limit on WAL space used (because PANIC sucks)

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>
Cc: Simon Riggs <simon(at)2ndquadrant(dot)com>, Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Peter Geoghegan <pg(at)heroku(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>
Subject: Re: Hard limit on WAL space used (because PANIC sucks)
Date: 2014-01-21 23:24:39
Message-ID: 19736.1390346679@sss.pgh.pa.us
Lists: pgsql-hackers

Jeff Janes <jeff(dot)janes(at)gmail(dot)com> writes:
> On Tue, Jan 21, 2014 at 9:35 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> My preference would be that we simply start failing writes with ERRORs
>> rather than PANICs. I'm not real sure ATM why this has to be a PANIC
>> condition. Probably the cause is that it's being done inside a critical
>> section, but could we move that?

> My understanding is that if it runs out of buffer space while in an
> XLogInsert, it will be holding one or more buffer content locks
> exclusively, and unless it can complete the xlog (or scrounge up the info
> to return that buffer to its previous state), it can never release that
> lock. There might be other paths where it could get by with an ERROR, but
> if no one can write xlog anymore, all of those paths must quickly converge
> to the one that cannot simply ERROR.

Well, the point is we'd have to somehow push detection of the problem
to a point before the critical section that does the buffer changes
and WAL insertion.

The first idea that comes to mind is (1) estimate the XLOG space needed
(an overestimate is fine here); (2) just before entering the critical
section, call some function to "reserve" that space, such that we always
have at least sum(outstanding reservations) available future WAL space;
(3) release our reservation as part of the actual XLogInsert call.

The problem here is that the "reserve" function would presumably need an
exclusive lock, and would be about as much of a hot spot as XLogInsert
itself is. Plus we'd be paying a lot of extra cycles to solve a corner
case problem that, with all due respect, comes up pretty darn seldom.
So probably we need a better idea than that.
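
For concreteness, steps (1) to (3) above could be sketched roughly as
follows. This is only an illustration: WalSpace, wal_reserve and
wal_insert are hypothetical names, not existing PostgreSQL functions, and
the sketch is single-threaded. A real version would have to serialize
these updates, which is exactly the hot spot described above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bookkeeping; none of these names exist in PostgreSQL. */
typedef struct WalSpace
{
    uint64_t free_bytes;     /* WAL space still available */
    uint64_t reserved_bytes; /* sum of outstanding reservations */
} WalSpace;

/*
 * Step (2): call just before entering the critical section.
 * Returning false here is the place where a plain ERROR could be
 * raised, since no buffers have been modified yet.
 */
bool
wal_reserve(WalSpace *ws, uint64_t estimate)
{
    /* Invariant: free_bytes >= reserved_bytes, so this cannot wrap. */
    if (ws->free_bytes - ws->reserved_bytes < estimate)
        return false;
    ws->reserved_bytes += estimate;
    return true;
}

/*
 * Step (3): the insert consumes the bytes actually written and
 * releases the whole reservation. The estimate from step (1) must be
 * an overestimate, so actual <= estimate.
 */
void
wal_insert(WalSpace *ws, uint64_t estimate, uint64_t actual)
{
    assert(actual <= estimate);
    assert(ws->reserved_bytes >= estimate);
    ws->reserved_bytes -= estimate;
    ws->free_bytes -= actual;
}
```

Because every reservation is an overestimate, free_bytes never drops
below reserved_bytes, so an insert that holds a reservation cannot run out
of space inside the critical section.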

Maybe we could get some mileage out of the fact that very approximate
techniques would be good enough. For instance, I doubt anyone would bleat
if the system insisted on having 10MB or even 100MB of future WAL space
always available. But I'm not sure exactly how to make use of that
flexibility.
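
One possible reading of "very approximate", offered purely as an
assumption and not as something proposed in this message: keep a cached
estimate of free WAL space that some background process refreshes
occasionally, and have writers compare it against a fixed slack margin
without taking any lock. All names below are hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed slack margin: 100MB of future WAL space. */
#define WAL_SLACK_BYTES (UINT64_C(100) * 1024 * 1024)

static_assert(WAL_SLACK_BYTES > 0, "slack margin must be nonzero");

/* Hypothetical cached estimate, refreshed now and then, never exact. */
_Atomic uint64_t approx_wal_free_bytes;

/* Called occasionally by whatever process measures free space. */
void
wal_publish_free_space(uint64_t free_bytes)
{
    atomic_store_explicit(&approx_wal_free_bytes, free_bytes,
                          memory_order_relaxed);
}

/*
 * Called before entering a critical section. A false result is where
 * an ERROR could be raised instead of a later PANIC. No lock is taken;
 * the value may be stale.
 */
bool
wal_space_probably_ok(void)
{
    uint64_t approx = atomic_load_explicit(&approx_wal_free_bytes,
                                           memory_order_relaxed);

    return approx >= WAL_SLACK_BYTES;
}
```

The check is only as good as the margin: it holds up only if less than
WAL_SLACK_BYTES of WAL can be written between two refreshes of the
estimate, which this sketch does nothing to enforce.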

regards, tom lane
