Re: Hard limit on WAL space used (because PANIC sucks)

From: Kevin Grittner <kgrittn(at)ymail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Greg Stark <stark(at)mit(dot)edu>
Cc: Simon Riggs <simon(at)2ndquadrant(dot)com>, Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Peter Geoghegan <pg(at)heroku(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>
Subject: Re: Hard limit on WAL space used (because PANIC sucks)
Date: 2014-01-22 13:58:18
Message-ID: 1390399098.58583.YahooMailNeo@web122305.mail.ne1.yahoo.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:

> Well, PANIC is certainly bad, but what I'm suggesting is that we
> just focus on getting that down to ERROR and not worry about
> trying to get out of the disk-shortage situation automatically.
> Nor do I believe that it's such a good idea to have the database
> freeze up until space appears rather than reporting errors.

Dusting off my DBA hat for a moment, I would say that I would be
happy if each process either generated an ERROR or went into a wait
state.  They would not all need to do the same thing; whichever is
easier in each process's context would do.  It would be nice if a
process which went into a long wait state until disk space became
available would issue a WARNING about that, where that is possible.

I feel that anyone running a database that matters to them should
be monitoring disk space and getting an alert from the monitoring
system in advance of actually running out of disk space; but we all
know that poorly managed systems are out there.  To accomodate them
we don't want to add risk or performance hits for those who do
manage their systems well.

The attempt to make more space by deleting WAL files scares me a
bit.  The dynamics of that seem like they could be
counterproductive if the pg_xlog directory is on the same
filesystem as everything else and there is a rapidly growing file.
Every time you cleaned up, the fast-growing file would eat more of
the space pg_xlog had been using, until perhaps it had no space to
keep even a single segment, no?  And how would that work with a
situation where the archive_command was failing, causing a build-up
WAL files?  It just seems like too much mechanism and incomplete
coverage of the problem space.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Robert Haas 2014-01-22 14:15:01 Re: ALTER SYSTEM SET typos and fix for temporary file name management
Previous Message Alvaro Herrera 2014-01-22 13:51:50 Re: GIN pending list pages not recycled promptly (was Re: GIN improvements part 1: additional information)