Re: Standby accepts recovery_target_timeline setting?

From: David Steele <david(at)pgmasters(dot)net>
To: Fujii Masao <masao(dot)fujii(at)gmail(dot)com>
Cc: Pg Hackers <pgsql-hackers(at)postgresql(dot)org>, Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com>
Subject: Re: Standby accepts recovery_target_timeline setting?
Date: 2019-09-26 17:58:16
Message-ID: 5e6537c7-d10e-6a67-4813-bbd7455cfaf5@pgmasters.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 9/26/19 5:55 AM, Fujii Masao wrote:
> On Thu, Sep 26, 2019 at 5:22 AM David Steele <david(at)pgmasters(dot)net> wrote:
>>
>> While testing against PG12 I noticed the documentation states that
>> recovery targets are not valid when standby.signal is present.
>
> Or that description in the doc is not true? Other recovery target
> parameters seem to take effect even when standby.signal exists.

Yes, and this is true for or any combination of recovery.signal and
standby signal as far as I can see. We have been tracking down some
strange behaviors over the last few days as we have been adding PG12
support to pgBackRest. Late in the day I know, but we just got the
relevant code migrated to C and we did not fancy coding it twice.

The main thing is if you set recovery_target_time in
postgresql.auto.conf then recovery will always try to hit that target
with any combination of recovery.signal and standby.signal. But
target_action is only active when recovery.signal, standby.signal, or
both are present.

All these tests were done on 12rc1.

So given this postgresql.auto.conf:

recovery_target_time = '2019-09-26 14:39:51.280711+00'
restore_command = 'cp /home/vagrant/test/archive/%f "%p"'
recovery_target_timeline = current
recovery_target_action = promote

And these settings added to postgresql.conf:

wal_level = replica
archive_mode = on
archive_command = 'test ! -f /home/vagrant/test/archive/%f && cp %p
/home/vagrant/test/archive/%f'

And this backup_label:

START WAL LOCATION: 0/2000028 (file 000000010000000000000002)
CHECKPOINT LOCATION: 0/2000060
BACKUP METHOD: streamed
BACKUP FROM: master
START TIME: 2019-09-26 14:39:49 UTC
LABEL: pg_basebackup base backup
START TIMELINE: 1

The backup we are recovering contains a table that exists at the target
time but is dropped after that as an additional confirmation. In all
the recovery scenarios below the table exists after recovery.

Here's what recovery looks like with recovery.signal:

2019-09-26 14:49:52.758 UTC [25353] LOG: database system was
interrupted; last known up at 2019-09-26 14:39:49 UTC
2019-09-26 14:49:52.824 UTC [25353] LOG: starting point-in-time
recovery to 2019-09-26 14:39:51.280711+00
2019-09-26 14:49:52.836 UTC [25353] LOG: restored log file
"000000010000000000000002" from archive
2019-09-26 14:49:52.885 UTC [25353] LOG: redo starts at 0/2000028
2019-09-26 14:49:52.894 UTC [25353] LOG: consistent recovery state
reached at 0/2000100
2019-09-26 14:49:52.894 UTC [25352] LOG: database system is ready to
accept read only connections
2019-09-26 14:49:52.905 UTC [25353] LOG: restored log file
"000000010000000000000003" from archive
2019-09-26 14:49:52.940 UTC [25353] LOG: recovery stopping before
commit of transaction 487, time 2019-09-26 14:39:54.981557+00
2019-09-26 14:49:52.940 UTC [25353] LOG: redo done at 0/30096A0
cp: cannot stat '/home/vagrant/test/archive/00000002.history': No such
file or directory
2019-09-26 14:49:52.943 UTC [25353] LOG: selected new timeline ID: 2
2019-09-26 14:49:52.998 UTC [25353] LOG: archive recovery complete
2019-09-26 14:49:52.998 UTC [25353] LOG: database system is ready to
accept connections

This is completely normal and what you would expect.

Now without recovery.signal from a fresh restore:

2019-09-26 14:52:29.491 UTC [25409] LOG: database system was
interrupted; last known up at 2019-09-26 14:39:49 UTC
2019-09-26 14:52:29.574 UTC [25409] LOG: restored log file
"000000010000000000000002" from archive
2019-09-26 14:52:29.622 UTC [25409] LOG: redo starts at 0/2000028
2019-09-26 14:52:29.631 UTC [25409] LOG: consistent recovery state
reached at 0/2000100
2019-09-26 14:52:29.642 UTC [25409] LOG: restored log file
"000000010000000000000003" from archive
2019-09-26 14:52:29.666 UTC [25409] LOG: recovery stopping before
commit of transaction 487, time 2019-09-26 14:39:54.981557+00
2019-09-26 14:52:29.666 UTC [25409] LOG: redo done at 0/30096A0
2019-09-26 14:52:29.716 UTC [25408] LOG: database system is ready to
accept connections

Now there is no "starting point-in-time recovery" message but we are
still stopping in the same place, "recovery stopping before commit of
transaction 487". There is no promotion so now we are now logging on
timeline 1 (so there are duplicate WAL errors as soon as archive_command
runs). In PG < 12 you could do this by shutting down, removing
recovery.conf and restarting, but it is now much easier to end up on the
same timeline.

Now with with standby.signal only from a fresh restore:

2019-09-26 14:59:36.889 UTC [25522] LOG: database system was
interrupted; last known up at 2019-09-26 14:39:49 UTC
2019-09-26 14:59:36.983 UTC [25522] LOG: entering standby mode
2019-09-26 14:59:36.994 UTC [25522] LOG: restored log file
"000000010000000000000002" from archive
2019-09-26 14:59:37.038 UTC [25522] LOG: redo starts at 0/2000028
2019-09-26 14:59:37.047 UTC [25522] LOG: consistent recovery state
reached at 0/2000100
2019-09-26 14:59:37.047 UTC [25521] LOG: database system is ready to
accept read only connections
2019-09-26 14:59:37.061 UTC [25522] LOG: restored log file
"000000010000000000000003" from archive
2019-09-26 14:59:37.093 UTC [25522] LOG: recovery stopping before
commit of transaction 487, time 2019-09-26 14:39:54.981557+00
2019-09-26 14:59:37.093 UTC [25522] LOG: redo done at 0/30096A0
cp: cannot stat '/home/vagrant/test/archive/00000002.history': No such
file or directory
2019-09-26 14:59:37.097 UTC [25522] LOG: selected new timeline ID: 2
2019-09-26 14:59:37.270 UTC [25522] LOG: archive recovery complete
cp: cannot stat '/home/vagrant/test/archive/00000001.history': No such
file or directory
2019-09-26 14:59:37.338 UTC [25521] LOG: database system is ready to
accept connections

The cluster starts in standby mode, hits the recovery target,
then promotes even though no recovery.signal is present.

And finally with both recovery.signal and standby.signal you get the
same as with standby.signal only.

I was able to get the same results using an xid target:

recovery_target_xid = 487
recovery_target_inclusive = false

All of this is roughly analogous to use cases that were possible
before, but there were fewer permutations then. You had no standby and
no recovery target without recovery.conf so "recovery.signal" was always
there, more or less.

At the very least, according to the docs, none of the target options are
supposed to be active unless recovery.signal is in place. Since
outdated entries in postgresql.auto.conf can have effect even in
the absence of recovery.signal, it seems pretty important to make sure
that mechanism is working correctly - or that the caveat is clearly
documented.

I do think this issue needs to be addressed before GA.

Fujii -- I just became aware of your email at [1] so I'll respond to
that as well.

--
-David
david(at)pgmasters(dot)net

[1]
https://www.postgresql.org/message-id/CAHGQGwEYYg_Ng%2B03FtZczacCpYgJ2Pn%3DB_wPtWF%2BFFLYDgpa1g%40mail.gmail.com

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Tomas Vondra 2019-09-26 18:07:52 Re: PATCH: logical_work_mem and logical streaming of large in-progress transactions
Previous Message Pavel Stehule 2019-09-26 17:39:31 Re: dropdb --force