Re: Hard limit on WAL space used (because PANIC sucks)

From: Craig Ringer <craig(at)2ndquadrant(dot)com>
To: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgreSQL(dot)org>, Peter Geoghegan <pg(at)heroku(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>
Subject: Re: Hard limit on WAL space used (because PANIC sucks)
Date: 2013-06-09 05:36:00
Message-ID: 51B41440.5060608@2ndquadrant.com
Lists: pgsql-hackers

On 06/06/2013 10:00 PM, Heikki Linnakangas wrote:
>
> I've seen a case, where it was even worse than a PANIC and shutdown.
> pg_xlog was on a separate partition that had nothing else on it. The
> partition filled up, and the system shut down with a PANIC. Because
> there was no space left, it could not even write the checkpoint after
> recovery, and thus refused to start up again. There was nothing else
> on the partition that you could delete to make space. The only
> recourse would've been to add more disk space to the partition
> (impossible), or manually delete an old WAL file that was not needed
> to recover from the latest checkpoint (scary). Fortunately this was a
> test system, so we just deleted everything.

There were a couple of dba.stackexchange.com reports along the same
lines recently, too. Both involved an antivirus vendor's VM appliance
with a canned (stupid) configuration that set wal_keep_segments too high
for the disk space allocated and stored WAL on a separate partition.
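To make the sizing problem concrete, here is a rough sketch of the arithmetic. It assumes the default 16 MB WAL segment size and uses the rule of thumb from the 9.x documentation for how many segment files pg_xlog normally holds. The function names and the example numbers are made up for illustration, and the result is an estimate of normal usage, not a hard limit.

```python
import math

# Assumption: default WAL segment size of 16 MB (it can be changed at build time).
SEGMENT_BYTES = 16 * 1024 * 1024


def estimated_max_wal_segments(checkpoint_segments, wal_keep_segments,
                               checkpoint_completion_target=0.5):
    """Estimate how many segment files pg_xlog normally holds.

    Uses the larger of the two rule-of-thumb bounds from the 9.x docs:
      (2 + checkpoint_completion_target) * checkpoint_segments + 1
      checkpoint_segments + wal_keep_segments + 1
    This is a normal-case estimate; usage can exceed it, for example
    when WAL archiving is failing.
    """
    by_checkpoints = (2 + checkpoint_completion_target) * checkpoint_segments + 1
    by_keep = checkpoint_segments + wal_keep_segments + 1
    return int(math.ceil(max(by_checkpoints, by_keep)))


def wal_fits(partition_bytes, checkpoint_segments, wal_keep_segments,
             checkpoint_completion_target=0.5):
    """Return True if the estimated WAL usage fits on the partition."""
    segments = estimated_max_wal_segments(
        checkpoint_segments, wal_keep_segments, checkpoint_completion_target)
    return segments * SEGMENT_BYTES <= partition_bytes


if __name__ == "__main__":
    # Hypothetical appliance: 8 GiB WAL partition, wal_keep_segments = 1000.
    partition = 8 * 1024 ** 3
    print(estimated_max_wal_segments(3, 1000))
    print(wal_fits(partition, 3, 1000))
```

With those hypothetical settings the estimate is roughly 1000 segments of 16 MB each, which is well over an 8 GiB partition, so a configuration like that would be expected to fill the disk.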

People are having issues with WAL space management in the real world, and
I think WAL space management and autovacuum are the two hardest things for
most people to configure and understand in Pg right now.

--
Craig Ringer http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
