Re: when the startup process doesn't

From: Andres Freund <andres(at)anarazel(dot)de>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Stephen Frost <sfrost(at)snowman(dot)net>, Magnus Hagander <magnus(at)hagander(dot)net>, Jehan-Guillaume de Rorthais <jgdr(at)dalibo(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: when the startup process doesn't
Date: 2021-04-20 20:36:42
Message-ID: 20210420203642.nvxvolm37f2k6prj@alap3.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

On 2021-04-20 14:56:58 -0400, Tom Lane wrote:
> I wonder though whether we really need authentication here. pg_ping
> already exposes whether the database is up, to anyone who can reach the
> postmaster port at all. Would it be so horrible if the "can't accept
> connections" error message included a detail about "recovery is X%
> done"?

Unfortunately I think something like a percentage is hard to calculate
right now. Even just looking at crash recovery (vs replication or
PITR), we don't currently know where the WAL ends without reading all
the WAL. The easiest thing to return would be something in LSNs or
bytes and I suspect that we don't want to expose either unauthenticated?

I wonder if we ought to occasionally update something like
ControlFileData->minRecoveryPoint on primaries, similar to what we do on
standbys? Then we could actually calculate a percentage, and it'd have
the added advantage of allowing to detect more cases where the end of
the WAL was lost. Obviously we'd have to throttle it somehow, to avoid
adding a lot of fsyncs, but that seems doable?

Greetings,

Andres Freund

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Alvaro Herrera 2021-04-20 20:54:57 Re: when the startup process doesn't
Previous Message Andres Freund 2021-04-20 20:28:27 Re: when the startup process doesn't