Re: Hot Standby query cancellation and Streaming Replication integration

From: Bruce Momjian <bruce(at)momjian(dot)us>
To: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>
Cc: Greg Smith <greg(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Hot Standby query cancellation and Streaming Replication integration
Date: 2010-02-26 18:47:09
Message-ID: 201002261847.o1QIl9Z29565@momjian.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Heikki Linnakangas wrote:
> > How to handle situations where the standby goes away for a while,
> > such as a network outage, so that it doesn't block the master from ever
> > cleaning up dead tuples is a concern.
>
> Yeah, that's another issue that needs to be dealt with. You'd probably
> need some kind of a configurable escape valve in the master, to let it
> ignore a standby's snapshot once it gets too old.
>
> > But I do know that the current Hot Standby implementation is going to be
> > frustrating to configure correctly for people.
>
> Perhaps others who are not as deep into the code as I am will have a
> better view on this, but I seriously don't think that's such a big
> issue. I think the max_standby_delay setting is quite intuitive and easy
> to explain. Sure, it would better if there was no tradeoff between
> killing queries and stalling recovery, but I don't think it'll be that
> hard to understand the tradeoff.

Let's look at the five documented cases of query conflict (from our manual):

1 Access Exclusive Locks from primary node, including both explicit
LOCK commands and various DDL actions

2 Dropping tablespaces on the primary while standby queries are
using those tablespaces for temporary work files (work_mem
overflow)

3 Dropping databases on the primary while users are connected to
that database on the standby.

4 The standby waiting longer than max_standby_delay to acquire a
buffer cleanup lock.

5 Early cleanup of data still visible to the current query's
snapshot

We might have a solution to #1 by only cancelling queries that try to
take locks.

#2 and #3 seem like rare occurances.

#4 can be controlled by max_standby_delay, where a large value only
delays playback during crash recovery --- again, a rare occurance.

#5 could be handled by using vacuum_defer_cleanup_age on the master.

Why is vacuum_defer_cleanup_age not listed in postgresql.conf?

In summary, I think passing snapshots to the master is not something
possible for 9.0, and ideally we will never need to add that feature.

--
Bruce Momjian <bruce(at)momjian(dot)us> http://momjian.us
EnterpriseDB http://enterprisedb.com
PG East: http://www.enterprisedb.com/community/nav-pg-east-2010.do
+ If your life is a hard drive, Christ can be your backup. +

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2010-02-26 18:53:16 Re: Re: Hot Standby query cancellation and Streaming Replication integration
Previous Message Gokulakannan Somasundaram 2010-02-26 18:30:51 Re: A thought on Index Organized Tables