Re: Hot Standby query cancellation and Streaming Replication integration

From: Greg Stark <gsstark(at)mit(dot)edu>
To: Greg Smith <greg(at)2ndquadrant(dot)com>
Cc: Bruce Momjian <bruce(at)momjian(dot)us>, Dimitri Fontaine <dfontaine(at)hi-media(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Josh Berkus <josh(at)agliodbs(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Hot Standby query cancellation and Streaming Replication integration
Date: 2010-02-27 03:40:57
Message-ID: 407d949e1002261940n43c521d6i1bb3220997886e09@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Sat, Feb 27, 2010 at 2:43 AM, Greg Smith <greg(at)2ndquadrant(dot)com> wrote:
>
> But if you're running the 8 hour report on the master right now, aren't you
> already exposed to a similar pile of bloat issues while it's going?  If I
> have the choice between "sometimes queries will get canceled" vs. "sometimes
> the master will experience the same long-running transaction bloat issues as
> in earlier versions even if the query runs on the standby", I feel like
> leaning toward the latter at least leads to a problem people are used to.

If they move from running these batch queries on the master to running
them on the slave then sure, the situation will be no worse than
before.

But if they move from having a plain old PITR warm standby to having
one they can run queries on they might well assume that the big
advantage of having the standby to play with is precisely that they
can do things there that they have never been able to do on the master
previously without causing damage.

I agree that having queries randomly and unpredictably canceled is
pretty awful. My argument was that max_standby_delay should default to
infinity on the basis that any other value has to be picked based on
actual workloads and SLAs.

My feeling is that there are probably only two types of configurations
that make sense, a HA replica with a low max_standby_delay or a
reporting replica with a high max_standby_delay. Any attempt to
combine the two on the same system will only work if you know your
application well and can make compromises with both.

I would also like to be able to handle load balancing read-only
queries but that will be really tricky. You want up-to-date data and
you want to be able to run moderately complex queries. That kind of
workload may well require synchronous replication to really work
properly.

--
greg

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Дмитрий Фефелов 2010-02-27 03:59:01 Re: Alpha4 Available Now
Previous Message Bruce Momjian 2010-02-27 03:38:49 Re: ALTER ROLE/DATABASE RESET ALL versus security