Re: time-delayed standbys

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Josh Berkus <josh(at)agliodbs(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: time-delayed standbys
Date: 2011-06-30 17:05:48
Message-ID: BANLkTin4OhdSywpOP2n2gt+vt+fTBYBKHA@mail.gmail.com
Lists: pgsql-hackers

On Thu, Jun 30, 2011 at 1:00 PM, Josh Berkus <josh(at)agliodbs(dot)com> wrote:
> On 6/30/11 2:00 AM, Simon Riggs wrote:
>>>> Manual (or scripted) intervention is always necessary if you reach disk
>>>> 100% full.
>>>
>>> Wow, that's a pretty crappy failure mode... but I don't think we need
>>> to fix it just on account of this patch.  It would be nice to fix, of
>>> course.
>> How is that different to running out of space in the main database?
>>
>> If I try to pour a pint of milk into a small cup, I don't blame the cup.
>
> I have to agree with Simon here.  ;-)
>
> We can do some things to make this easier for administrators, but
> there's no way to "solve" the problem.  And the things we could do would
> have to be advanced optional modes which aren't on by default, so they
> wouldn't really help the DBA with poor planning skills.  Here's my
> suggestions:
>
> 1) Have a utility (pg_archivecleanup?) which checks if we have more than
> a specific setting's worth of archive_logs, and breaks replication and
> deletes the archive logs if we hit that number.  This would also require
> some way for the standby to stop replicating *without* becoming a
> standalone server, which I don't think we currently have.
>
> 2) Have a setting where, regardless of standby_delay settings, the
> standby will interrupt any running queries and start applying logs as
> fast as possible if it hits a certain number of unapplied archive logs.
>  Of course, given the issues we had with standby_delay, I'm not sure I
> want to complicate it further.
>
> I think we've already fixed the biggest issue in 9.1, since we now have
> a limit on the number of WALs the master will keep if archiving is
> failing ... yes?  That's the only big *avoidable* failure mode we have,
> where a failing standby effectively shuts down the master.

I'm not sure we changed anything in this area for 9.1. Am I wrong?
wal_keep_segments was present in 9.0. Using that instead of archiving
is a reasonable way to bound the amount of disk space that can get
used, at the cost of possibly needing to rebuild the standby if things
get too far behind. Of course, in any version, you could also use an
archive_command that will remove old files to make space if the disk
is full, with the same downside: if the standby isn't done with those
files, you're now in for a rebuild.
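
To make the second option concrete, here is a minimal sketch of an
archive script that deletes the oldest archived files when the archive
disk runs low, then copies the new segment in. Everything specific in
it is an assumption made for illustration: the script name and path,
the helper names (free_fraction, prune_oldest, archive), the 10%
free-space threshold, and the use of file modification time to decide
which files are oldest. It is not something PostgreSQL ships.

```python
import os
import shutil
import sys


def free_fraction(path):
    """Fraction of the filesystem holding `path` that is still free."""
    usage = shutil.disk_usage(path)
    return usage.free / usage.total


def prune_oldest(archive_dir, need_free, free_fn=free_fraction):
    """Delete the oldest files (by mtime) until enough space is free.

    Returns the names of the removed files. A standby that still needed
    any of them will have to be rebuilt.
    """
    with os.scandir(archive_dir) as it:
        entries = sorted(
            (e for e in it if e.is_file()),
            key=lambda e: (e.stat().st_mtime, e.name),
        )
    removed = []
    for entry in entries:
        if free_fn(archive_dir) >= need_free:
            break
        os.remove(entry.path)
        removed.append(entry.name)
    return removed


def archive(src, name, archive_dir, need_free=0.10):
    """Copy one WAL file into the archive; return 0 only on success."""
    dest = os.path.join(archive_dir, name)
    if os.path.exists(dest):
        return 1  # never overwrite a file that is already archived
    prune_oldest(archive_dir, need_free)
    tmp = dest + ".tmp"
    shutil.copyfile(src, tmp)  # raises OSError if the disk is still full
    os.rename(tmp, dest)       # make the finished file appear in one step
    return 0


# Hypothetical invocation: archive_wal.py %p %f /path/to/archive
if __name__ == "__main__" and len(sys.argv) == 4:
    sys.exit(archive(sys.argv[1], sys.argv[2], sys.argv[3]))
```

It would be hooked up with something like
archive_command = 'python3 /path/to/archive_wal.py %p %f /mnt/archive'
(both paths are placeholders). The script exits nonzero if the copy
fails, so the server keeps the segment and retries later.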

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
