Re: Re: Clarifying "server starting" messaging in pg_ctl start without --wait

From: Stephen Frost <sfrost(at)snowman(dot)net>
To: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com>, Fujii Masao <masao(dot)fujii(at)gmail(dot)com>, Ryan Murphy <ryanfmurphy(at)gmail(dot)com>, Beena Emerson <memissemerson(at)gmail(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Re: Clarifying "server starting" messaging in pg_ctl start without --wait
Date: 2017-01-18 13:25:44
Message-ID: 20170118132544.GX18360@tamriel.snowman.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Michael,

* Michael Paquier (michael(dot)paquier(at)gmail(dot)com) wrote:
> On Wed, Jan 18, 2017 at 10:35 AM, Michael Paquier
> <michael(dot)paquier(at)gmail(dot)com> wrote:
> > On Wed, Jan 18, 2017 at 7:31 AM, Stephen Frost <sfrost(at)snowman(dot)net> wrote:
> >> Perhaps we need a way for pg_ctl to realize a cold-standby case and
> >> throw an error or warning if --wait is specified then, but that hardly
> >> seems like the common use-case. It also wouldn't make any sense to have
> >> anything in the init system which depended on PG being up in such a case
> >> because, well, PG isn't ever going to be 'up'.
> >
> > Yeah, it seems to me that we are likely looking for a wait mode saying
> > to exit pg_ctl once Postgres is happily rejecting connections, because
> > that means that it is up and that it is sorting out something first
> > before accepting them. This would basically filter the states in the
> > control file that we find as acceptable if the connection test
> > continues complaining about PQPING_REJECT.
>
> Another option would be as well to log the state of the control file
> to the user to let him know what currently happens, and document that
> increasing the wait timeout is recommended if the recovery time since
> the last redo point takes longer.

I was actually thinking about it the other way- start out by changing
them to both be 5m and then document next to checkpoint_timeout (and
max_wal_size, perhaps...) that if you go changing those parameters (eg:
bumping up checkpoint_timeout to 30 minutes and max_wal_size up enough
that you're still checkpointing based on time and not due to running out
of WAL space) then you might need to consider raising the timeout for
pg_ctl to wait around for the server to finish going through crash
recovery due to all of the outstanding changes since the last
checkpoint.

In particular, I like to think that will help people understand the
downsides of raising those values; I've run into a number of cases where
people seem to feel it's a win-win without any downside, but that isn't
really the case.

Thanks!

Stephen

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2017-01-18 13:43:24 Re: Implement targetlist SRFs using ROWS FROM() (was Changed SRF in targetlist handling)
Previous Message Stephen Frost 2017-01-18 13:21:03 Re: Re: Clarifying "server starting" messaging in pg_ctl start without --wait