Re: Hot Standby query cancellation and Streaming Replication integration

From: Richard Huxton <dev(at)archonet(dot)com>
To: Greg Smith <greg(at)2ndquadrant(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Hot Standby query cancellation and Streaming Replication integration
Date: 2010-02-26 14:17:23
Message-ID: 4B87D7F3.1060603@archonet.com
Lists: pgsql-hackers

On 26/02/10 08:33, Greg Smith wrote:
> There are a number of HS
> tunables that interact with one another, and depending on your priorities a
> few ways you can try to optimize the configuration for what I expect to
> be common use cases for this feature.

> I've written a blog entry at
> http://blog.2ndquadrant.com/en/2010/02/tradeoffs-in-hot-standby-deplo.html
> that tries to explain all that background clearly,

It did too. Thanks for the nice summary people can be pointed at.

> I'm not sure what you might be expecting from the above combination, but
> what actually happens is that many of the SELECT statements on the table
> *that isn't even being updated* are canceled. You see this in the logs:

Hmm - this I'd already figured out for myself. It's just occurred to me 
that this could well be the case between databases too. Database A gets 
vacuumed, B gets its queries kicked off on the standby. Granted lots of 
people just have the one main DB, but even so...

> LOG: restored log file "0000000100000000000000A5" from archive
> ERROR: canceling statement due to conflict with recovery
> DETAIL: User query might have needed to see row versions that must be
> removed.
> STATEMENT: SELECT sum(abalance) FROM pgbench_accounts;
>
> Basically, every time a WAL segment appears that wipes out a tuple that
> SELECT expects should still be visible, because the dead row left behind
> by the update has been vacuumed away, the query is canceled. This
> happens all the time the way I've set this up, and I don't feel like
> this is a contrived demo. Having a long-running query on the standby
> while things get updated and then periodically autovacuumed on the
> primary is going to be extremely common in the sorts of production
> systems I expect want HS the most.
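
A rough way to picture the behaviour quoted above is the toy model below. It is only a sketch under simplifying assumptions, not the actual recovery code: each standby query is reduced to the oldest xid its snapshot still cares about, a cleanup record is reduced to the newest xid whose row versions it removes, and xid wraparound is ignored. The names StandbyQuery, snapshot_xmin, latest_removed_xid and conflicting_queries are made up for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class StandbyQuery:
    name: str
    table: str          # table the query reads; deliberately unused by the check
    snapshot_xmin: int  # oldest xid this query's snapshot may still need


def conflicting_queries(queries: List[StandbyQuery],
                        latest_removed_xid: int) -> List[StandbyQuery]:
    """Return the queries whose snapshot is old enough that it might still
    need row versions removed by a cleanup record.

    Assumption: the check looks only at xids, never at which table the
    query is reading. Wraparound is ignored.
    """
    return [q for q in queries if q.snapshot_xmin <= latest_removed_xid]


queries = [
    StandbyQuery("long_sum", "pgbench_accounts", snapshot_xmin=100),
    StandbyQuery("recent_lookup", "pgbench_branches", snapshot_xmin=150),
]

# A cleanup record for some other table removes versions up to xid 120.
cancelled = conflicting_queries(queries, latest_removed_xid=120)
print([q.name for q in cancelled])
```

In this model the long-running query is picked for cancellation purely because of its snapshot age, even though the cleanup touched a table it never reads, which matches the symptom in the log excerpt.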

I can see pretty much everyone wanting HS+SR. Thousands of small DBs running 
on VMs for a start. Free mostly-live backup? Got to be a winner.

Dumb non-hacker question: why do we cancel all transactions rather than 
just those with "ACCESS SHARE" on the vacuumed table in question? Is it 
the simple fact that we don't know what table this particular section of 
WAL affects, or is it the complexity of tracking all this info?

> If you're running a system that also is using Streaming Replication,
> there is a much better approach possible.

> "Requires keep-alives with timestamps to be added to sync rep feature"
>
> If those keep-alives flowed in both directions, and included both
> timestamps *and* xid visibility information, the master could easily be
> configured to hold open xid snapshots needed for long running queries on
> the standby when that was necessary.

Presumably meaning we need *another* config setting to prevent excessive 
bloat on a heavily updated table on the master.
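
To make that trade-off concrete, here is a minimal sketch of how such a limit might interact with xid feedback from the standby. Everything in it is hypothetical: vacuum_horizon, standby_xmin and max_holdback_xids are illustrative names, not existing settings or functions, and xid wraparound is ignored.

```python
from typing import Optional


def vacuum_horizon(master_oldest_xmin: int,
                   standby_xmin: Optional[int],
                   next_xid: int,
                   max_holdback_xids: int) -> int:
    """Return the oldest xid whose row versions vacuum must keep.

    master_oldest_xmin: oldest xid still needed by the master's own snapshots.
    standby_xmin:       oldest xid the standby reports needing, or None if
                        no feedback has arrived.
    max_holdback_xids:  hypothetical cap on how far behind next_xid the
                        standby is allowed to hold the horizon.
    """
    if standby_xmin is None:
        return master_oldest_xmin
    # Honour the standby's request only up to the cap, so one very long
    # standby query cannot hold back cleanup (and cause bloat) indefinitely.
    capped_standby_xmin = max(standby_xmin, next_xid - max_holdback_xids)
    # The master's own snapshots are always protected, cap or no cap.
    return min(master_oldest_xmin, capped_standby_xmin)


# Standby asks for xid 400, but the cap only allows holding back to 700.
print(vacuum_horizon(1000, 400, next_xid=1200, max_holdback_xids=500))
```

With a cap like this, a standby query that falls outside the allowed window would still have to be cancelled, so the setting trades bloat on the master against cancellations on the standby.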

-- 
   Richard Huxton
   Archonet Ltd
