Re: Hot Standby query cancellation and Streaming Replication integration

From: Greg Smith <greg(at)2ndquadrant(dot)com>
To: Bruce Momjian <bruce(at)momjian(dot)us>
Cc: Dimitri Fontaine <dfontaine(at)hi-media(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Greg Stark <gsstark(at)mit(dot)edu>, Josh Berkus <josh(at)agliodbs(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Hot Standby query cancellation and Streaming Replication integration
Date: 2010-02-27 02:43:58
Message-ID: 4B8886EE.8010407@2ndquadrant.com
Lists: pgsql-hackers

Bruce Momjian wrote:
> Well, I think the choice is either you delay vacuum on the master for 8
> hours or pile up 8 hours of WAL files on the slave, and delay
> application, and make recovery much slower. It is not clear to me which
> option a user would prefer because the bloat on the master might be
> permanent.
>

But if you're running the 8 hour report on the master right now, aren't
you already exposed to a similar pile of bloat issues while it's going?
If I have the choice between "sometimes queries will get canceled" vs.
"sometimes the master will experience the same long-running transaction
bloat issues as in earlier versions even if the query runs on the
standby", I feel like leaning toward the latter at least leads to a
problem people are used to.

To me, this falls into the principle of least astonishment category.
Testing the final design for how transactions get canceled here led me
to some really unexpected situations, and the downside for a mistake is
"your query is lost". Had I instead discovered that sometimes
long-running transactions on the standby can ripple back to cause a
maintenance slowdown on the master, that would not have been great. But
it would not have been so surprising, and it would not have resulted in
lost query results.

I think people will expect that their queries cancel because of things
like DDL changes. And the existing knobs allow inserting some slack for
things like locks taking a little bit of time to acquire sometimes.
What I don't think people will see coming is that a routine update on an
unrelated table is going to kill a query whose result they might have
been waiting hours for, just because that update pushed the other table
past an autovacuum threshold and triggered a dead row cleanup.
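
To make the "crossed an autovacuum threshold" case concrete, here is a
rough sketch of the documented trigger rule (autovacuum_vacuum_threshold
plus autovacuum_vacuum_scale_factor times the table's row count, with
the stock defaults of 50 and 0.2). The function names and the table
size are made up for illustration, and this only models when the
cleanup gets scheduled on the master, not the cancellation logic on the
standby.

```python
# Toy model only: the formula follows PostgreSQL's documented autovacuum
# trigger rule; function names and table sizes are illustrative assumptions.

def autovacuum_threshold(reltuples, base_threshold=50, scale_factor=0.2):
    """Dead-row count a table must exceed before autovacuum vacuums it."""
    return base_threshold + scale_factor * reltuples


def vacuum_triggered(dead_tuples, reltuples, base_threshold=50, scale_factor=0.2):
    """True once the dead rows accumulated since the last VACUUM exceed the threshold."""
    return dead_tuples > autovacuum_threshold(reltuples, base_threshold, scale_factor)


# Hypothetical 10,000-row table: the threshold is about 2,050 dead rows.
reltuples = 10_000
limit = autovacuum_threshold(reltuples)

# A batch of routine updates that stays below the threshold changes nothing.
quiet = vacuum_triggered(2_000, reltuples)

# A slightly larger batch crosses it; the resulting cleanup is what can
# conflict with a long-running query on the standby.
noisy = vacuum_triggered(2_100, reltuples)
```

Nothing about the update that tips the table over the line looks any
different from the ones before it, which is why the cancellation is so
hard to anticipate from the standby side.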

--
Greg Smith 2ndQuadrant US Baltimore, MD
PostgreSQL Training, Services and Support
greg(at)2ndQuadrant(dot)com www.2ndQuadrant.us
