Re: BUG #13770: Extending recovery_min_apply_delay on Standby causes it to be unavailable for a while

From: Greg Clough <greg(at)gclough(dot)com>
To: pgsql-bugs(at)postgresql(dot)org
Subject: Re: BUG #13770: Extending recovery_min_apply_delay on Standby causes it to be unavailable for a while
Date: 2015-12-21 10:44:02
Message-ID: CADjwvTMqrqQhyyyrXwf1qHtJ6-U+TZJdGWjd3S5A3PQ3LxT34A@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs

Hi,

I didn't see any response to this, and I do believe it's an actual
PostgreSQL bug... or at least a case of undesirable undocumented behaviour.

*SHORT SUMMARY: If you are using a Standby with recovery_min_apply_delay =
1h and you increase it to 24h, then your Standby will be unavailable for
the next 23h.*

Could someone confirm my diagnosis?

Thanks.
Greg Clough.

On 13 November 2015 at 09:02, Greg Clough <greg(at)gclough(dot)com> wrote:

> I may have glossed over how I created the standby, but just for clarity it
> was done with the primary shutdown cleanly:
>
> su - postgres
> export PGPORT=5432
> pg_ctl stop -m fast
> cd /u01/postgres/9.4/5432
> tar cvf - ./data | (cd /u01/postgres/9.4/5433 && tar xvf -)
>
> Regads.
> Greg.
>
> On 11 November 2015 at 15:50, <postgresql(dot)org(at)gclough(dot)com> wrote:
>
>> The following bug has been logged on the website:
>>
>> Bug reference: 13770
>> Logged by: Greg Clough
>> Email address: postgresql(dot)org(at)gclough(dot)com
>> PostgreSQL version: 9.4.5
>> Operating system: Oracle Linux v6.7
>> Description:
>>
>> If you extend the recovery_min_apply_delay on a standby and then restart
>> it,
>> you cannot connect to it for read-only transactions until the most recent
>> transaction time >= delay time. This is problematic if you extend from
>> say
>> 1 hour to 24 hours, as you have to wait almost a day before you can
>> connect
>> to your Standby again.
>>
>> I can't find anything in the documentation to describe this as intended
>> behaviour, and my expectation was that when I extended the delay that the
>> Standby database would come up and be available for read-only
>> transactions,
>> but would simply delay any new transactions until they were older than
>> recovery_min_apply_delay.
>>
>> I'm curious if this was a design decision to stop people getting confused
>> when they access a database before recovery_min_apply_delay has expired
>> and
>> get data that's too recent, or if it was an unintentional consequence of
>> the
>> way it was implemented.
>>
>>
>> Environment:
>>
>> Oracle Linux v6.7
>> PostgreSQL v9.4.5 (Installed via yum)
>>
>> postgres(at)gclough[5432] ~$ uname -a
>> Linux gclough 3.8.13-98.4.1.el6uek.x86_64 #2 SMP Wed Sep 23 18:46:01 PDT
>> 2015 x86_64 x86_64 x86_64 GNU/Linux
>>
>> postgres(at)gclough[5432] ~$ psql --version
>> psql (PostgreSQL) 9.4.5
>>
>>
>> Test Case:
>>
>> 1. Create a Primary database on port 5432
>>
>> export PGDATA=/u01/postgres/9.4/5432/data
>> export PGPORT=5432
>> initdb
>>
>> wal_level = hot_standby # minimal, archive, hot_standby,
>> or
>> logical
>> max_wal_senders = 1 # max number of walsender processes
>>
>> 2. Replicate it to a Standby database on port 5433
>>
>> export PGDATA=/u01/postgres/9.4/5433/data
>> export PGPORT=5433
>>
>> port = 5433 # (change requires restart)
>> wal_level = hot_standby # minimal, archive, hot_standby,
>> or
>> logical
>> max_wal_senders = 1 # max number of walsender processes
>> hot_standby = on # "on" allows queries during
>> recovery
>>
>> 3. On the Standby (Port 5433), introduce a 2min time delay with
>> "recovery_min_apply_delay" in the recovery.conf:
>>
>> standby_mode = 'on'
>> primary_conninfo = 'port=5432 host=localhost user=postgres
>> application_name=port_5433'
>> recovery_min_apply_delay = 2min
>>
>> 4. Check the current lag on the Standby (Port 5433):
>>
>> postgres(at)gclough[5433] /u01/postgres/9.4/5433/data$ date; psql -c 'select
>> pg_last_xlog_receive_location() "receive_xlog",
>> pg_last_xlog_replay_location() "replay_xlog",
>> pg_last_xact_replay_timestamp(), pg_is_in_recovery() "recovering"'
>> Wed Nov 11 15:32:32 GMT 2015
>> receive_xlog | replay_xlog | pg_last_xact_replay_timestamp | recovering
>> --------------+-------------+-------------------------------+------------
>> 0/7000000 | 0/7000000 | | t
>> (1 row)
>>
>> 5. Create a small test table on the Primary (Port 5432):
>>
>> CREATE TABLE delay_test (id NUMERIC);
>> INSERT INTO delay_test VALUES (1);
>>
>> 6. Check the application of WAL on the Standby (Port 5433), and note the
>> delay:
>>
>> postgres(at)gclough[5433] /u01/postgres/9.4/5433/data$ date; psql -c 'select
>> pg_last_xlog_receive_location() "receive_xlog",
>> pg_last_xlog_replay_location() "replay_xlog",
>> pg_last_xact_replay_timestamp(), pg_is_in_recovery() "recovering"'
>> Wed Nov 11 15:33:06 GMT 2015
>> receive_xlog | replay_xlog | pg_last_xact_replay_timestamp | recovering
>> --------------+-------------+-------------------------------+------------
>> 0/7015DC8 | 0/7015890 | | t
>> (1 row)
>>
>> postgres(at)gclough[5433] /u01/postgres/9.4/5433/data$ date; psql -c 'select
>> pg_last_xlog_receive_location() "receive_xlog",
>> pg_last_xlog_replay_location() "replay_xlog",
>> pg_last_xact_replay_timestamp(), pg_is_in_recovery() "recovering"'
>> Wed Nov 11 15:34:00 GMT 2015
>> receive_xlog | replay_xlog | pg_last_xact_replay_timestamp | recovering
>> --------------+-------------+-------------------------------+------------
>> 0/7015E00 | 0/7015890 | | t
>> (1 row)
>>
>> postgres(at)gclough[5433] /u01/postgres/9.4/5433/data$ date; psql -c 'select
>> pg_last_xlog_receive_location() "receive_xlog",
>> pg_last_xlog_replay_location() "replay_xlog",
>> pg_last_xact_replay_timestamp(), pg_is_in_recovery() "recovering"'
>> Wed Nov 11 15:35:01 GMT 2015
>> receive_xlog | replay_xlog | pg_last_xact_replay_timestamp | recovering
>> --------------+-------------+-------------------------------+------------
>> 0/7015E00 | 0/7015E00 | 2015-11-11 15:32:53.950376+00 | t
>> (1 row)
>>
>>
>> 7. Reset the time delay on the Standby from 2 to 5 minutes
>>
>> recovery_min_apply_delay = 5min
>>
>> pg_ctl restart -m fast
>>
>> 8. Try to connect to the Standby, and get a failure:
>>
>> postgres(at)gclough[5433] /u01/postgres/9.4/5433/data$ date; psql -c 'select
>> pg_last_xlog_receive_location() "receive_xlog",
>> pg_last_xlog_replay_location() "replay_xlog",
>> pg_last_xact_replay_timestamp(), pg_is_in_recovery() "recovering"'
>> Wed Nov 11 15:35:26 GMT 2015
>> psql: FATAL: the database system is starting up
>>
>> postgres(at)gclough[5433] /u01/postgres/9.4/5433/data$ date; psql -c 'select
>> pg_last_xlog_receive_location() "receive_xlog",
>> pg_last_xlog_replay_location() "replay_xlog",
>> pg_last_xact_replay_timestamp(), pg_is_in_recovery() "recovering"'
>> Wed Nov 11 15:36:01 GMT 2015
>> psql: FATAL: the database system is starting up
>>
>> postgres(at)gclough[5433] /u01/postgres/9.4/5433/data$ date; psql -c 'select
>> pg_last_xlog_receive_location() "receive_xlog",
>> pg_last_xlog_replay_location() "replay_xlog",
>> pg_last_xact_replay_timestamp(), pg_is_in_recovery() "recovering"'
>> Wed Nov 11 15:37:00 GMT 2015
>> psql: FATAL: the database system is starting up
>>
>> 9. Wait 3 minutes... then it works:
>>
>> postgres(at)gclough[5433] /u01/postgres/9.4/5433/data$ date; psql -c 'select
>> pg_last_xlog_receive_location() "receive_xlog",
>> pg_last_xlog_replay_location() "replay_xlog",
>> pg_last_xact_replay_timestamp(), pg_is_in_recovery() "recovering"'
>> Wed Nov 11 15:38:00 GMT 2015
>> receive_xlog | replay_xlog | pg_last_xact_replay_timestamp | recovering
>> --------------+-------------+-------------------------------+------------
>> 0/7015ED8 | 0/7015ED8 | 2015-11-11 15:32:53.950376+00 | t
>> (1 row)
>>
>>
>> --
>> Sent via pgsql-bugs mailing list (pgsql-bugs(at)postgresql(dot)org)
>> To make changes to your subscription:
>> http://www.postgresql.org/mailpref/pgsql-bugs
>>
>
>

In response to

Responses

Browse pgsql-bugs by date

  From Date Subject
Next Message Tom Lane 2015-12-21 14:22:59 Re: Cannot log in as newly created user
Previous Message Michael Paquier 2015-12-21 07:26:25 Re: Re: BUG #13685: Archiving while idle every archive_timeout with wal_level hot_standby