Re: Different behaviour for pg_ctl --wait between pg9.5 and pg10

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Greg k <gregg(dot)kay(at)gmail(dot)com>
Cc: pgsql-bugs(at)lists(dot)postgresql(dot)org
Subject: Re: Different behaviour for pg_ctl --wait between pg9.5 and pg10
Date: 2018-03-19 13:48:15
Message-ID: 20277.1521467295@sss.pgh.pa.us
Lists: pgsql-bugs

Greg k <gregg(dot)kay(at)gmail(dot)com> writes:
> I have a script where after a point-in-time recovery I run
> "pg_ctl start -D /data -w -t 86400"
> and then try to connect as soon as pg_ctl finishes. With Postgres 9.5.9 (on
> Centos 7.4) I can connect at the end. But with Postgres 10.3 I get a
> connection error
> psql: FATAL: the database system is starting up
> It seems with Postgres 10.3 the postmaster.pid file state goes from
> 'starting' to 'standby' to 'ready' but pg_ctl is saying the server is ready
> to accept connections even though the postmaster.pid file says 'standby'.

The problem from pg_ctl's standpoint is that it can't tell whether
"standby" is a short-lived state. In PG 10 it assumes not, so it exits
once that state is reached. The previous implementation couldn't
distinguish that state at all (because PQping doesn't) and would therefore
wait until the server accepted connections or it timed out. That happened
to be good for your use-case, I guess, but a lot of other people did not
like it: it led to waiting till timeout, and then reporting failure, when
starting a non-hot standby server. Even in the PITR case, there's no
certainty that the server will exit "standby" state in any prompt fashion
(i.e., before pg_ctl times out), so even under 9.5 you did not really have a
guarantee that the server would accept connections after pg_ctl exited.

I'd suggest adding a wait-for-psql-to-connect loop after your pg_ctl
start when starting a PITR run.
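
A minimal sketch of such a loop, in case it helps. It is only one way to do it,
not a tested recipe: the function name wait_until, the 5-second polling
interval, and the "postgres" database name are assumptions of this example,
not anything pg_ctl provides. The probe relies on psql exiting nonzero while
the server still answers "the database system is starting up".

```shell
#!/bin/sh
# wait_until TIMEOUT_SECONDS INTERVAL_SECONDS COMMAND [ARGS...]
# Hypothetical helper: re-run COMMAND until it succeeds, giving up once
# roughly TIMEOUT_SECONDS of sleeping have accumulated.
# Returns 0 on success, 1 on timeout.
wait_until() {
    timeout=$1
    interval=$2
    shift 2
    elapsed=0
    # Probe output is discarded; only the exit status matters here.
    while ! "$@" >/dev/null 2>&1; do
        if [ "$elapsed" -ge "$timeout" ]; then
            return 1
        fi
        sleep "$interval"
        elapsed=$((elapsed + interval))
    done
    return 0
}

# Assumed usage after a PITR start (paths and timeout taken from the
# original report; database name "postgres" is a guess):
#
#   pg_ctl start -D /data -w -t 86400
#   wait_until 86400 5 psql -d postgres -Atqc 'SELECT 1' || exit 1
```

The elapsed time counts only the sleeps, not the time each probe takes, so the
real deadline is somewhat later than the nominal one. pg_isready could serve as
the probe instead of psql if you only need to know that connections are
accepted.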

regards, tom lane
