Re: failover vs. read only queries

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Bruce Momjian <bruce(at)momjian(dot)us>
Cc: Fujii Masao <masao(dot)fujii(at)gmail(dot)com>, Josh Berkus <josh(at)agliodbs(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: failover vs. read only queries
Date: 2010-07-02 04:13:10
Message-ID: 4843.1278043990@sss.pgh.pa.us
Lists: pgsql-hackers

Bruce Momjian <bruce(at)momjian(dot)us> writes:
> Fujii Masao wrote:
>> On Thu, Jun 10, 2010 at 5:06 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>>> My feeling about it is that if you want fast failover you should not
>>> have your failover target server configured as hot standby at all, let
>>> alone hot standby with a long max_standby_delay. Such a slave could be
>>> very far behind on applying WAL when the crunch comes, and no amount of
>>> query killing will save you from that. Put your long-running standby
>>> queries on a different slave instead.
>>>
>>> We should consider whether we can improve the situation in 9.1, but it
>>> is not a must-fix for 9.0; especially when the correct behavior isn't
>>> immediately obvious.

>> OK. Let's revisit in 9.1.
>>
>> I attached the proposal patch for 9.1. The patch treats max_standby_delay
>> as zero (i.e., cancels all conflicting queries immediately) from the moment
>> the trigger file is created. So we can end recovery without
>> waiting for any lock held by queries, and minimize the failover time.
>> OTOH, queries which don't conflict with a recovery survive the failover.

> Should this be added to the first 9.1 commitfest?

Not sure ... it seems like a proof of concept for a pretty dubious
concept. If you want a slave to be ready for fast failover then you
should not be letting it get far behind the master in the first place.
I think there's some missing piece here, but I'm not quite sure what
to propose.

regards, tom lane
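
[Editorial illustration, not part of the original message and not code from
the attached patch.] The behavior described in the quoted proposal can be
sketched roughly as follows, assuming a toy model in which each standby query
is either in conflict with recovery or not. The names StandbyQuery,
effective_standby_delay, and queries_to_cancel are hypothetical and do not
correspond to anything in the PostgreSQL source; the sketch only shows the
rule "once failover is triggered, treat max_standby_delay as zero".

```python
from dataclasses import dataclass
from typing import List


@dataclass
class StandbyQuery:
    """Hypothetical model of a read-only query running on a hot standby."""
    name: str
    conflicts_with_recovery: bool
    conflict_wait_s: float  # how long recovery has already waited on it


def effective_standby_delay(max_standby_delay_s: float,
                            failover_triggered: bool) -> float:
    """Return the grace period applied to conflicting queries.

    Per the proposal, the configured delay is ignored (treated as zero)
    once the trigger file has been created.
    """
    return 0.0 if failover_triggered else max_standby_delay_s


def queries_to_cancel(queries: List[StandbyQuery],
                      max_standby_delay_s: float,
                      failover_triggered: bool) -> List[str]:
    """Names of queries that would be cancelled under this toy model."""
    delay = effective_standby_delay(max_standby_delay_s, failover_triggered)
    return [
        q.name
        for q in queries
        # Non-conflicting queries are never cancelled, so they survive failover.
        if q.conflicts_with_recovery and q.conflict_wait_s >= delay
    ]


if __name__ == "__main__":
    running = [
        StandbyQuery("report", conflicts_with_recovery=True, conflict_wait_s=5.0),
        StandbyQuery("lookup", conflicts_with_recovery=False, conflict_wait_s=5.0),
    ]
    print(queries_to_cancel(running, 30.0, failover_triggered=False))
    print(queries_to_cancel(running, 30.0, failover_triggered=True))
```

Note that the sketch does nothing about the objection raised above: cancelling
queries does not shorten the time needed to replay WAL that the standby has
not yet applied.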
