Re: pg_promote() can cause busy loop

From: Fujii Masao <masao(dot)fujii(at)gmail(dot)com>
To: Michael Paquier <michael(at)paquier(dot)xyz>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pg_promote() can cause busy loop
Date: 2019-09-05 01:53:19
Message-ID: CAHGQGwGHmNTZmf7fWFMHgJWq6vZLbDk_XR+Jegfsf=vCDdPo+w@mail.gmail.com
Lists: pgsql-hackers

On Thu, Sep 5, 2019 at 10:26 AM Michael Paquier <michael(at)paquier(dot)xyz> wrote:
>
> On Thu, Sep 05, 2019 at 09:46:26AM +0900, Fujii Masao wrote:
> > I found small issue in pg_promote(). If postmaster dies
> > while pg_promote() is waiting for the standby promotion to finish,
> > pg_promote() can cause busy loop. This happens because
> > pg_promote() does nothing when WaitLatch() detects
> > the postmaster death event. I think that pg_promote()
> > should bail out of the loop immediately in that case.
> >
> > Attached is the patch for the fix.
>
> Indeed, this is not correct.
>
> - ereport(WARNING,
> - (errmsg("server did not promote within %d seconds",
> - wait_seconds)));
> + if (i >= WAITS_PER_SECOND * wait_seconds)
> + ereport(WARNING,
> + (errmsg("server did not promote within %d seconds", wait_seconds)));
>
> Would it make more sense to issue a warning mentioning the postmaster
> death and then return PG_RETURN_BOOL(false) instead of breaking out of
> the loop? It could be confusing to warn about a timeout if the
> postmaster died in parallel, and we know the actual reason why the
> promotion did not happen in this case.

It's fine to use PG_RETURN_BOOL(false) instead of breaking out of the
loop in that case, and it would make the code simpler.

But I don't think it's worth warning about postmaster death here,
because the backend will emit a FATAL message like "terminating
connection due to unexpected postmaster exit" in secure_read() after
pg_promote() returns false.
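
To illustrate the control flow being discussed, here is a minimal
standalone sketch. It is not the attached patch and it does not use the
real server APIs. Scenario, fake_wait_latch(), fake_recovery_in_progress()
and wait_for_promotion() are hypothetical names, and the flag values are
placeholders chosen for the simulation. The point is only that once the
wait reports postmaster death it keeps returning immediately, so the loop
has to return false at that point instead of spinning until the timeout.

```c
#include <stdbool.h>
#include <stdio.h>

/* Placeholder event flags for this simulation (values are assumptions). */
#define WL_TIMEOUT          (1 << 0)
#define WL_POSTMASTER_DEATH (1 << 1)

#define WAITS_PER_SECOND 10

/* Hypothetical test scenario: when things happen, counted in wait cycles. */
typedef struct
{
    int promote_after;      /* cycle at which promotion completes, -1 = never */
    int postmaster_dies_at; /* cycle at which postmaster dies, -1 = never */
} Scenario;

/*
 * Stand-in for WaitLatch(): after postmaster death it reports the death
 * event on every call without sleeping, which is what makes a loop that
 * ignores the event spin.
 */
static int
fake_wait_latch(const Scenario *s, int cycle)
{
    if (s->postmaster_dies_at >= 0 && cycle >= s->postmaster_dies_at)
        return WL_POSTMASTER_DEATH;
    return WL_TIMEOUT;
}

/* Stand-in for the "is the standby still in recovery?" check. */
static bool
fake_recovery_in_progress(const Scenario *s, int cycle)
{
    return !(s->promote_after >= 0 && cycle >= s->promote_after);
}

/*
 * Sketch of the wait loop. Returns true if promotion finished, false on
 * postmaster death or timeout. *iterations reports how many cycles ran.
 */
static bool
wait_for_promotion(const Scenario *s, int wait_seconds, int *iterations)
{
    int i;

    *iterations = 0;
    for (i = 0; i < WAITS_PER_SECOND * wait_seconds; i++)
    {
        int rc;

        *iterations = i + 1;

        if (!fake_recovery_in_progress(s, i))
            return true;

        rc = fake_wait_latch(s, i);

        /*
         * Bail out immediately on postmaster death. No timeout warning is
         * printed here, since the timeout is not the real cause.
         */
        if (rc & WL_POSTMASTER_DEATH)
            return false;
    }

    /* Only a genuine timeout reaches this point. */
    fprintf(stderr, "WARNING: server did not promote within %d seconds\n",
            wait_seconds);
    return false;
}
```

With a scenario where the postmaster dies at cycle 3 and wait_seconds is
60, this sketch returns false after 4 cycles and prints no timeout
warning, instead of running all 600 cycles.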

Regards,

--
Fujii Masao
