Skip site navigation (1) Skip section navigation (2)

Re: Re: Hot Standby query cancellation and Streaming Replication integration

From: Bruce Momjian <bruce(at)momjian(dot)us>
To: Greg Stark <gsstark(at)mit(dot)edu>
Cc: Josh Berkus <josh(at)agliodbs(dot)com>, Greg Smith <greg(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Re: Hot Standby query cancellation and Streaming Replication integration
Date: 2010-03-02 21:36:52
Message-ID: 201003022136.o22LaqT29573@momjian.us (view raw or flat)
Thread:
Lists: pgsql-hackers
Greg Stark wrote:
> On Mon, Mar 1, 2010 at 5:50 PM, Josh Berkus <josh(at)agliodbs(dot)com> wrote:
> > I don't think that defer_cleanup_age is a long-term solution. ?But we
> > need *a* solution which does not involve delaying 9.0.
> 
> So I think the primary solution currently is to raise max_standby_age.
> 
> However there is a concern with max_standby_age. If you set it to,
> say, 300s. Then run a 300s query on the slave which causes the slave
> to fall 299s behind. Now you start a new query on the slave -- it gets
> a snapshot based on the point in time that the slave is currently at.
> If it hits a conflict it will only have 1s to finish before the
> conflict causes the query to be cancelled.
> 
> In short in the current setup I think there is no safe value of
> max_standby_age which will prevent query cancellations short of -1. If
> the slave has a constant stream of queries and always has at least one
> concurrent query running then it's possible that the slave will run
> continuously max_standby_age-epsilon behind the master and cancel
> queries left and right, regardless of how large max_standby_age is.
> 
> To resolve this I think you would have to introduce some chance for
> the slave to catch up. Something like refusing to use a snapshot older
> than max_standby_age/2  and instead wait until the existing queries
> finish and the slave gets a chance to catch up and see a more recent
> snapshot. The problem is that this would result in very unpredictable
> and variable response times from the slave. A single long-lived query
> could cause replay to pause for a big chunk of max_standby_age and
> prevent any new query from starting.

That is a good point. I have added the attached documentation patch to
mention that max_standby_delay increases the master/slave inconsistency,
and not to use it for xid-keepalive connections.

-- 
  Bruce Momjian  <bruce(at)momjian(dot)us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

  PG East:  http://www.enterprisedb.com/community/nav-pg-east-2010.do

Attachment: /rtmp/diff
Description: text/x-diff (2.4 KB)

In response to

pgsql-hackers by date

Next:From: Christopher BrowneDate: 2010-03-02 23:22:02
Subject: Re: caracara failing to bind to localhost?
Previous:From: Simon RiggsDate: 2010-03-02 21:00:24
Subject: Re: Re: Hot Standby query cancellation and Streaming Replication integration

Privacy Policy | About PostgreSQL
Copyright © 1996-2014 The PostgreSQL Global Development Group