Re: Re: Hot Standby query cancellation and Streaming Replication integration

From: Bruce Momjian <bruce(at)momjian(dot)us>
To: Greg Smith <greg(at)2ndquadrant(dot)com>
Cc: Joachim Wieland <joe(at)mcknight(dot)de>, Greg Stark <gsstark(at)mit(dot)edu>, Robert Haas <robertmhaas(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Re: Hot Standby query cancellation and Streaming Replication integration
Date: 2010-03-02 18:30:15
Message-ID: 201003021830.o22IUFW28509@momjian.us
Lists: pgsql-hackers

Greg Smith wrote:
> > I assumed they would set max_standby_delay = -1 and be happy.
> >
>
> The admin in this situation might be happy until the first time the
> primary fails and a failover is forced, at which point there is an
> unbounded amount of recovery data to apply that was stuck waiting behind
> whatever long-running queries were active. I don't know if you've ever
> watched what happens to a pre-8.2 cold standby when you start it up with
> hundreds or thousands of backed up WAL files to process before the
> server can start, but it's not a fast process. I watched a production
> 8.1 standby get >4000 files behind once due to an archive_command bug,
> and it's not something I'd like to ever chew my nails off to again. If
> your goal was HA and you're trying to bring up the standby, the server
> is down the whole time that's going on.
>
> This is why no admin who prioritizes HA would consider
> 'max_standby_delay = -1' a reasonable setting, and those are the sort of
> users Joachim's example was discussing. Only takes one rogue query that
> runs for a long time to make the standby so far behind it's useless for
> HA purposes. And you also have to ask yourself "if recovery is halted
> while waiting for this query to run, how stale is the data on the
> standby getting?". That's true for any large setting for this
> parameter, but using -1 for the unlimited setting also gives the maximum
> possible potential for such staleness.
>
> 'max_standby_delay = -1' is really only a reasonable idea if you are
> absolutely certain all queries are going to be short, which we can't
> dismiss as an unfounded use case so it has value. I would expect you
> have to also combine it with a matching reasonable statement_timeout to
> enforce that expectation to make that situation safer.

Well, as you stated in your blog, you are going to have one of these
downsides:

o master bloat
o delayed recovery
o cancelled queries

Right now you can't choose "master bloat", but you can choose the other
two. I think that is acceptable for 9.0, assuming the other two don't
have the problems that Tom foresees.
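
As a rough sketch of those two choices in the standby's postgresql.conf
(the parameter names are the ones discussed in this thread; the numeric
values are illustrative assumptions, not recommendations):

```
# Choice 1: delayed recovery.
# Recovery waits without limit for conflicting standby queries, so no
# query is cancelled, but WAL replay can fall arbitrarily far behind.
max_standby_delay = -1
# Pairing it with a statement_timeout, as Greg suggests above, bounds how
# long any single query can hold up replay. 5min is an assumed example.
statement_timeout = 5min

# Choice 2: cancelled queries.
# Recovery waits only a bounded time, then conflicting queries are
# cancelled so replay can proceed. 30s is an assumed example value.
#max_standby_delay = 30s
```

Only one of the two max_standby_delay lines would be active at a time;
the second is left commented out here to show the alternative.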

Our documentation should probably just come out and state that clearly.

--
Bruce Momjian <bruce(at)momjian(dot)us> http://momjian.us
EnterpriseDB http://enterprisedb.com

PG East: http://www.enterprisedb.com/community/nav-pg-east-2010.do
