Re: Standby accepts recovery_target_timeline setting?

From: Fujii Masao <masao(dot)fujii(at)gmail(dot)com>
To: David Steele <david(at)pgmasters(dot)net>
Cc: Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Standby accepts recovery_target_timeline setting?
Date: 2019-09-28 17:26:02
Message-ID: CAHGQGwHXm9ZAKxqg798yuiZMhp+uPTt-4YSfOGFvEkzFZZRfFQ@mail.gmail.com
Lists: pgsql-hackers

On Sun, Sep 29, 2019 at 12:51 AM David Steele <david(at)pgmasters(dot)net> wrote:
>
> On 9/28/19 10:54 AM, Fujii Masao wrote:
> > On Sat, Sep 28, 2019 at 2:01 AM David Steele <david(at)pgmasters(dot)net> wrote:
> >> On 9/27/19 11:58 AM, Fujii Masao wrote:
> >>>
> >>> Yes, recovery target settings are used even when neither backup_label
> >>> nor recovery.signal exist, i.e., just a crash recovery, in v12. This is
> >>> completely different behavior from prior versions.
> >>
> >> I'm not able to reproduce this. I only see recovery settings being used
> >> if backup_label, recovery.signal, or standby.signal is present.
> >>
> >> Do you have an example?
> >
> > Yes, here is the example:
> >
> > initdb -D data
> > pg_ctl -D data start
> > psql -c "select pg_create_restore_point('hoge')"
> > psql -c "alter system set recovery_target_name to 'hoge'"
> > psql -c "create table test as select num from generate_series(1, 100) num"
> > pg_ctl -D data -m i stop
> > pg_ctl -D data start
> >
> > After restarting the server at the above final step, you will see
> > the following log messages indicating that recovery stops at
> > recovery_target_name.
> >
> > 2019-09-28 22:42:04.849 JST [16944] LOG: recovery stopping at restore
> > point "hoge", time 2019-09-28 22:42:03.86558+09
> > 2019-09-28 22:42:04.849 JST [16944] FATAL: requested recovery stop
> > point is before consistent recovery point
>
> That's definitely not good behavior.
>
> >>> IMO, since v12 is at RC1 now, it's not a good idea to introduce new logic.
> >>> So at least for v12, we basically should change the recovery logic so that
> >>> it behaves in the same way as prior versions. That is,
> >>>
> >>> - Stop the recovery with an error if any recovery target is set in
> >>> crash recovery
> >>
> >> This seems reasonable. I tried adding a recovery.signal and
> >> restore_command for crash recovery and I just got an error that it
> >> couldn't find 00000002.history in the archive.
> >
> > You added recovery.signal, so it means that you started an archive recovery,
> > not crash recovery. Right?
>
> Correct.
>
> > Anyway, I'm thinking of applying something like the attached patch, to emit
> > an error if a recovery target is set in crash recovery.
>
> The patch looks reasonable.
>
> >>> - Do not enter an archive recovery mode if recovery.signal is missing
> >>
> >> Agreed. Perhaps it's OK to use restore_command if a backup_label is
> >> present
> >
> > Yeah, maybe it's OK, but it's different behavior from the current version.
> > So, at least for v12, I'm inclined to prevent crash recovery with backup_label
> > from using restore_command, i.e., only WAL files in pg_wal will be replayed
> > in this case.
>
> Agreed. Seems like that could be added to the patch above easily
> enough. More checks would be needed to prevent the behaviors I've been
> seeing in the other thread, but it should be possible to more or less
> mimic the old behavior with sufficient checks.

Yeah, more checks would be necessary. IMO, an easy fix is to forbid not only
recovery target parameters but also any other recovery parameters (those
specified in recovery.conf in previous versions) in crash recovery.

In v11 and earlier, no parameter in recovery.conf can take effect in
crash recovery, because crash recovery always starts without recovery.conf.
But in v12 those parameters are specified in postgresql.conf,
so they may take effect even in crash recovery (i.e., when both
recovery.signal and standby.signal are missing). I think this is the root
cause of the problems that we are discussing.
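
To make the distinction concrete, here is a rough sketch of how the startup
mode follows from the signal files in v12. It is only an illustration, not
server code: the function name recovery_mode is made up, and the rule that
standby.signal wins when both files exist is my understanding of the v12
behavior rather than something established in this thread.

```shell
#!/bin/sh
# Hypothetical helper (not part of PostgreSQL): report which kind of
# recovery a v12 data directory would start in, judging only by the
# signal files.
recovery_mode() {
    datadir=$1
    if [ -f "$datadir/standby.signal" ]; then
        # Assumption: standby.signal takes precedence over recovery.signal.
        echo standby
    elif [ -f "$datadir/recovery.signal" ]; then
        echo archive
    else
        # Neither file exists: plain crash recovery, which is the case
        # where the recovery parameters unexpectedly take effect in v12.
        echo crash
    fi
}
```

For the reproduction steps quoted above, recovery_mode data would print
"crash", and yet recovery_target_name is still honored.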

There might be some recovery parameters that we can safely use
even in crash recovery, e.g., maybe recovery_end_command
(currently, recovery_end_command is executed in
crash recovery in v12). But at this stage of v12, it's worth considering
simply making crash recovery exit with an error when any recovery
parameter is set. Thoughts?

Or, if that change is overkill, we could instead make crash recovery
"ignore" any recovery parameters, e.g., by forcibly disabling
the parameters.
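
Until the server enforces either rule, an operator could approximate the
"exit with an error" option with a pre-flight check before starting the
server. The following is only a sketch under several assumptions: the
function names and the RECOVERY_PARAMS list are made up for illustration, the
list is a subset of the former recovery.conf parameters, and only
postgresql.conf and postgresql.auto.conf inside the data directory are read
(include directives and settings written without "=" are not handled).

```shell
#!/bin/sh
# Sketch of an operator-side pre-flight check, NOT the server-side patch.

# Illustrative subset of the parameters that lived in recovery.conf
# before v12 (assumption: extend as needed).
RECOVERY_PARAMS='restore_command|recovery_end_command|archive_cleanup_command|recovery_target[a-z_]*|primary_conninfo|primary_slot_name'

# Print every uncommented line that sets one of the parameters above to a
# non-empty value. Returns 0 if at least one such line was found.
recovery_params_set() {
    datadir=$1
    found=1
    for f in "$datadir/postgresql.conf" "$datadir/postgresql.auto.conf"; do
        [ -f "$f" ] || continue
        # The first grep selects the parameters; the second drops lines
        # whose value is the empty string '', which means "not set".
        if grep -E "^[[:space:]]*($RECOVERY_PARAMS)[[:space:]]*=" "$f" |
            grep -Ev "=[[:space:]]*''[[:space:]]*(#.*)?\$"; then
            found=0
        fi
    done
    return "$found"
}

# Fail if the server would start in crash recovery (no signal file) while
# recovery parameters are set.
check_crash_recovery() {
    datadir=$1
    if [ -f "$datadir/standby.signal" ] || [ -f "$datadir/recovery.signal" ]; then
        return 0    # archive recovery or standby: parameters are expected
    fi
    if recovery_params_set "$datadir" >/dev/null; then
        echo "recovery parameters are set but no signal file exists in $datadir" >&2
        return 1
    fi
    return 0
}
```

Usage would be something like: check_crash_recovery data && pg_ctl -D data start.
With the reproduction steps quoted above, the check fails because
postgresql.auto.conf contains recovery_target_name = 'hoge' and neither
signal file is present.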

Regards,

--
Fujii Masao
