Re: BUG #13770: Extending recovery_min_apply_delay on Standby causes it to be unavailable for a while

From: Greg Clough <greg(at)gclough(dot)com>
To: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>
Cc: PostgreSQL mailing lists <pgsql-bugs(at)postgresql(dot)org>, Simon Riggs <simon(at)2ndquadrant(dot)com>
Subject: Re: BUG #13770: Extending recovery_min_apply_delay on Standby causes it to be unavailable for a while
Date: 2015-12-30 13:05:30
Message-ID: CADjwvTPz+dH_SwSd3mqEDTPUfU3Oq3mFAppjkXay83Kqnj9Wjw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs

Hi Michael,

I figured that if the database was in a consistent state when it was
shutdown with a 1 hour delay, then when it was restarted with a new 24 hour
delay then it would also be consistent and thus available for read-only
work. In my test case, no transactions have occurred but the standby still
wasn't accessible. I think this was compounded, as I think the error
itself a bit confusing:

psql: FATAL: the database system is starting up

I believe the unavailability of the standby for extended periods if the
recovery_min_apply_delay is increased will create some confusion, just as
it confused me initially. I can see two schools of thought when the
parameter is increased:

1. The standby includes transactions that are newer than the
"recovery_min_apply_delay" setting, so it must be prevented from exposing
any data to ensure applications don't see data too soon.

2. The standby is consistent, so it should be available for read-only
queries... but any new WAL should not be applied until the commit time is >
recovery_min_apply_delay.

Obviously I fall into camp #2, where I expected the database to basically
pause application of WAL until it passed the recovery_min_apply_delay...
but still be available for read-only queries.

If the preferred option is #1, then could we introduce a new error message
so that it's a bit more communicative. Maybe something like:

psql: FATAL: the database system has transactions newer than
recover_min_apply_delay. Waiting...

... or something better of your choosing.

Regards.
Greg.

On 26 December 2015 at 13:45, Michael Paquier <michael(dot)paquier(at)gmail(dot)com>
wrote:

> On Mon, Dec 21, 2015 at 7:44 PM, Greg Clough <greg(at)gclough(dot)com> wrote:
> > I didn't see any response to this, and I do believe it's an actual
> > PostgreSQL bug... or at least a case of undesirable undocumented
> behaviour.
> >
> > SHORT SUMMARY: If you are using a Standby with recovery_min_apply_delay
> =
> > 1h and you increase it to 24h, then your Standby will be unavailable for
> the
> > next 23h.
> >
> > Could someone confirm my diagnosis?
>
> When recovery_min_apply_delay is set, replay should wait for the delay
> defined in the case of a COMMIT or COMMIT PREPARED record, and this
> even if the minimum recovery point ensuring that standby is in a
> consistent state on is not reached. In short, if there is a COMMIT
> before the standby thinks that it has reached a consistent state to
> allow read-only queries, which is what you are looking for, your
> application won't be able to connect to the standby, and the feature
> behaves correctly.
>
> Depending on the use cases, it may be interesting to have a switch
> allowing to not apply the delay should a consistent point not be
> reached though...
> --
> Michael
>

In response to

Responses

Browse pgsql-bugs by date

  From Date Subject
Next Message Michael Paquier 2015-12-30 13:44:17 Re: BUG #13770: Extending recovery_min_apply_delay on Standby causes it to be unavailable for a while
Previous Message Andres Freund 2015-12-30 11:49:14 Re: BUG #13822: Slave terminated - WAL contains references to invalid page