Re: pg_get_wal_replay_pause_state() should not return 'paused' while a promotion is ongoing.

From: Dilip Kumar <dilipbalaut(at)gmail(dot)com>
To: Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>
Cc: Fujii Masao <masao(dot)fujii(at)oss(dot)nttdata(dot)com>, Michael Paquier <michael(at)paquier(dot)xyz>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pg_get_wal_replay_pause_state() should not return 'paused' while a promotion is ongoing.
Date: 2021-05-19 07:59:31
Message-ID: CAFiTN-teHyu3cjRkTpnpzR0ujC4edHzch4V++KE=n5bfg71RNw@mail.gmail.com
Lists: pgsql-hackers

On Wed, May 19, 2021 at 11:55 AM Kyotaro Horiguchi
<horikyota(dot)ntt(at)gmail(dot)com> wrote:
>
> At Wed, 19 May 2021 11:19:13 +0530, Dilip Kumar <dilipbalaut(at)gmail(dot)com> wrote in
> > On Wed, May 19, 2021 at 10:16 AM Fujii Masao
> > <masao(dot)fujii(at)oss(dot)nttdata(dot)com> wrote:
> > >
> > > On 2021/05/18 15:46, Michael Paquier wrote:
> > > > On Tue, May 18, 2021 at 12:48:38PM +0900, Fujii Masao wrote:
> > > >> Currently a promotion causes all available WAL to be replayed before
> > > >> a standby becomes a primary whether it was in paused state or not.
> > > >> OTOH, something like immediate promotion (i.e., standby becomes
> > > >> a primary without replaying outstanding WAL) might be useful for
> > > >> some cases. I don't object to that.
> > > >
> > > > Sounds like a "promotion immediate" mode. It does not sound difficult
> > > > nor expensive to add a small test for that in one of the existing
> > > > recovery tests triggering a promotion. Could you add one based on
> > > > pg_get_wal_replay_pause_state()?
> > >
> > > Are you thinking of adding a test like the following?
> > > #1. Pause the recovery
> > > #2. Confirm that pg_get_wal_replay_pause_state() returns 'paused'
> > > #3. Trigger standby promotion
> > > #4. Confirm that pg_get_wal_replay_pause_state() returns 'not paused'
> > >
> > > Step #4 seems hard to test reliably because
> > > pg_get_wal_replay_pause_state() needs to be executed
> > > before the promotion finishes.
> >
> > Even for #2, we cannot be sure whether it will return 'paused' or
> > 'pause requested'.
>
> We often use poll_query_until() to make sure some desired state is
> reached. And, as Michael suggested, the function
> pg_get_wal_replay_pause_state() still works at the time of
> recovery_end_command. So slightly more detailed steps are:

Right, if we are polling for the state change in #2 then that makes sense.

> #0. Configure the server with a recovery_end_command that waits for
> some trigger, then start the server.
> #1. Pause the recovery
> #2. Wait until pg_get_wal_replay_pause_state() returns 'paused'
> #3. Trigger standby promotion
> #4. Wait until pg_get_wal_replay_pause_state() returns 'not paused'
> #5. Trigger recovery_end_command to let promotion proceed.

+1
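
To make the ordering concrete, here is a rough, self-contained sketch of
steps #1 to #5 as a polling loop. It is only a simulation: FakeStandby,
poll_until() and run_scenario() are hypothetical names invented for
illustration, not the TAP framework's poll_query_until() or any real
client API, and the fake promote() assumes the proposed behavior (the
pause state reads 'not paused' once promotion is triggered). Step #0 is
modeled by a flag that keeps the fake promotion from finishing until it
is released.

```python
import time
from typing import Callable, List


def poll_until(get_state: Callable[[], str], expected: str,
               timeout: float = 5.0, interval: float = 0.01) -> bool:
    """Call get_state() repeatedly until it returns `expected` or time runs out."""
    deadline = time.monotonic() + timeout
    while True:
        if get_state() == expected:
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)


class FakeStandby:
    """Hypothetical in-memory stand-in for a standby server (assumption)."""

    def __init__(self) -> None:
        self.state = "not paused"
        self.polls_until_paused = 0
        self.promotion_finished = False
        self.end_command_released = False  # models the trigger from step #0

    def pause(self) -> None:
        # The request is acknowledged only later, so the state is
        # 'pause requested' for a few polls before it becomes 'paused'.
        self.state = "pause requested"
        self.polls_until_paused = 3

    def pause_state(self) -> str:
        if self.state == "pause requested":
            self.polls_until_paused -= 1
            if self.polls_until_paused <= 0:
                self.state = "paused"
        return self.state

    def promote(self) -> None:
        # Assumed (proposed) behavior: triggering promotion clears the pause
        # state, but promotion stays blocked on the end command.
        self.state = "not paused"

    def release_end_command(self) -> None:
        self.end_command_released = True
        self.promotion_finished = True


def run_scenario() -> List[str]:
    standby = FakeStandby()
    observed = []
    standby.pause()                                        # step 1
    assert poll_until(standby.pause_state, "paused")       # step 2
    observed.append(standby.pause_state())
    standby.promote()                                      # step 3
    assert poll_until(standby.pause_state, "not paused")   # step 4
    assert not standby.promotion_finished                  # still held by step 0
    observed.append(standby.pause_state())
    standby.release_end_command()                          # step 5
    return observed
```

Polling in step #2 is what absorbs the 'pause requested' window, and
holding the end command until step #5 is what keeps the promotion from
finishing before step #4 has been checked.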

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com
