Re: patch proposal

From: Stephen Frost <sfrost(at)snowman(dot)net>
To: Venkata B Nagothi <nag1010(at)gmail(dot)com>
Cc: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>, David Steele <david(at)pgmasters(dot)net>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: patch proposal
Date: 2016-08-26 02:30:08
Message-ID: 20160826023008.GH4028@tamriel.snowman.net
Lists: pgsql-hackers

* Venkata B Nagothi (nag1010(at)gmail(dot)com) wrote:
> On Thu, Aug 25, 2016 at 10:59 PM, Stephen Frost <sfrost(at)snowman(dot)net> wrote:
> > I'm not a fan of the "recovery_target" option, particularly as it's only
> > got one value even though it can mean two things (either "immediate" or
> > "not set"), but we need a complete solution before we can consider
> > deprecating it. Further, we could consider making it an alias for
> > whatever better name we come up with.
>
> The new parameter will accept options : "pause", "shutdown" and "promote"
>
> *"promote"*
>
> This option will ensure database starts up once the "immediate" consistent
> recovery point is reached even if it is well before the mentioned recovery
> target point (XID, Name or time).
> This behaviour will be similar to that of recovery_target="immediate" and
> can be aliased.

I don't believe we're really going at this the right way. Clearly,
there will be cases where we'd like promotion at the end of the WAL
stream (as we currently have) even if the recovery point is not found,
but if the new option's "promote" is the same as "immediate" then we
don't have that.

We need to break this down into all the different possible combinations
and then come up with names for the options to define them. I don't
believe a single option is going to be able to cover all of the cases.

The cases which I'm considering are:

recovery target is immediate (as soon as we have consistency)
recovery target is a set point (name, xid, time, whatever)

action to take if recovery target is found
action to take if recovery target is not found

Generally, "action" is one of "promote", "pause", or "shutdown".
Clearly, not all actions are valid for all recovery target cases. In
particular, "immediate" with "recovery target not found" cannot support
the "promote" or "pause" options. Otherwise, we can support:

Recovery Target  |  Found  |  Action
-----------------|---------|----------
immediate        |  Yes    |  promote
immediate        |  Yes    |  pause
immediate        |  Yes    |  shutdown

immediate        |  No     |  shutdown

name/xid/time    |  Yes    |  promote
name/xid/time    |  Yes    |  pause
name/xid/time    |  Yes    |  shutdown

name/xid/time    |  No     |  promote
name/xid/time    |  No     |  pause
name/xid/time    |  No     |  shutdown

We could clearly support this with these options:

recovery_target = immediate, other
recovery_action_target_found = promote, pause, shutdown
recovery_action_target_not_found = promote, pause, shutdown
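
As a rough illustration of the matrix above, here is a minimal Python sketch that enumerates which (target, found, action) combinations would be accepted. The option names come from this proposal and are not existing PostgreSQL parameters; the function names `is_supported` and `supported_combinations` are hypothetical, and the comment on the "immediate / not found" case is my reading of why only "shutdown" is listed there.

```python
from itertools import product

# Values taken from the proposal in this message; none of this is an
# existing PostgreSQL API.
TARGETS = ("immediate", "other")          # "other" = name/xid/time
ACTIONS = ("promote", "pause", "shutdown")


def is_supported(target: str, found: bool, action: str) -> bool:
    """Return True if the combination appears in the table above."""
    if target not in TARGETS or action not in ACTIONS:
        return False
    if target == "immediate" and not found:
        # Assumption: "not found" here means consistency was never
        # reached, so shutdown is the only action listed in the table.
        return action == "shutdown"
    return True


def supported_combinations():
    """List every accepted (target, found, action) triple."""
    return [
        (target, found, action)
        for target, found, action in product(TARGETS, (True, False), ACTIONS)
        if is_supported(target, found, action)
    ]


if __name__ == "__main__":
    for target, found, action in supported_combinations():
        print(f"{target:<10} found={found!s:<5} action={action}")
```

Enumerating the combinations this way gives the same ten rows as the table, which is a quick check that three options are enough to express every case.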

One question to ask is whether we need an option, for xid and time
targets, to cover the point at which we realize that we won't find the
recovery target. If consistency is reached at a time which is later
than the recovery target for time, what then? Do we go through the rest
of the WAL and perform the "not found" action at the end of the WAL
stream? If we use that approach, then at least all of the recovery
target types are handled the same, but I can definitely see cases where
an administrator might prefer an "error" option.

I'd suggest we attempt to support that also.
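
To make the time-target question concrete, the following Python sketch shows one way the decision could be expressed once consistency is reached. It is only an illustration of the two behaviours discussed here, not a description of how PostgreSQL recovery works; the function name `decide_at_consistency` and the `error_if_unreachable` flag are hypothetical.

```python
from datetime import datetime


def decide_at_consistency(
    consistency_time: datetime,
    target_time: datetime,
    not_found_action: str,
    error_if_unreachable: bool = False,
) -> str:
    """Decide what to do when consistency is reached for a time target.

    Returns "continue" if the target may still lie ahead in the WAL,
    "error" if the target is already behind us and the (hypothetical)
    error option is set, and otherwise the configured "not found"
    action, to be applied once the end of the WAL stream is reached.
    """
    if consistency_time <= target_time:
        # The target has not been passed yet; keep replaying WAL.
        return "continue"
    if error_if_unreachable:
        # The administrator prefers to fail as soon as we know the
        # target cannot be found.
        return "error"
    # Handle this like any other target type: replay the rest of the
    # WAL, then perform the "not found" action.
    return not_found_action


if __name__ == "__main__":
    reached = datetime(2016, 8, 26, 2, 30)
    wanted = datetime(2016, 8, 25, 12, 0)
    print(decide_at_consistency(reached, wanted, "shutdown"))
    print(decide_at_consistency(reached, wanted, "shutdown", True))
```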

Thanks!

Stephen
