|From:||Jehan-Guillaume de Rorthais <jgdr(at)dalibo(dot)com>|
|To:||Michael Paquier <michael(at)paquier(dot)xyz>|
|Cc:||Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>, sfrost(at)snowman(dot)net, masao(dot)fujii(at)gmail(dot)com, pgsql-hackers(at)postgresql(dot)org|
|Subject:||Re: Fetching timeline during recovery|
|Views:||Raw Message | Whole Thread | Download mbox | Resend email|
On Fri, 31 Jan 2020 15:12:30 +0900
Michael Paquier <michael(at)paquier(dot)xyz> wrote:
> On Thu, Jan 23, 2020 at 05:54:08PM +0100, Jehan-Guillaume de Rorthais wrote:
> > Please find the new version of the patch in attachment.
> To be honest, I find the concept of this patch confusing.
> pg_stat_wal_receiver is just a one-one mapping with the shared memory
> state of the WAL receiver itself and show data *if and only if* a WAL
> receiver is running and iff it is ready to display any data, so I'd
> rather not change its nature
If you are talking about the pg_stat_wal_receiver view, I don't have a strong
opinion on this anyway as I vote 0 when discussing it. My current patch
doesn't alter its nature.
> and it has nothing to do with the state of WAL being applied by the startup
Indeed, I was feeling this was a bad design to add these columns, as stated in
my last mail. So I withdraw this.
> So this gets a -1 from me.
> Isn't what you are looking for here a different system view which maps
> directly to XLogCtl so as you can retrieve the status of the applied
> WAL at recovery anytime
My main objective is received LSN/TLI. This is kept by WalRcv for streaming.
That's why pg_stat_wal_receiver was the very good place for my need. But again,
you are right, I shouldn't have add the replied bits to it.
> say pg_stat_recovery?
I finally dig this path. I was in the hope we could find something
simpler and lighter, but other solutions we studied so far (thanks all for your
time) were all discarded .
A new pg_stat_get_recovery() view might be useful for various monitoring
purpose. After poking around in the code, it seems the patch would be bigger
than previous solutions, so I prefer discussing the specs first.
At a first glance, I would imagine the following columns as a minimal patch:
* source: stream, archive or pg_wal
* write/flush/replayed LSN
* write/flush/replayed TLI
This has already some heavy impact in the code. Source might be taken from
xlog.c:currentSource, so it should probably be included in XLogCtl to be
accessible from any backend.
As replayed LSN/TLI comes from XLogCtl too, we might probably need a new
dedicated function to gather these fields plus currentSource under the same
Next, write lsn/tli is not accessible from WalRcv, only flush. So either we do
not include it, or we would probably need to replace WalRcv->receivedUpto with
Next, there's no stats about wal shipping recovery. Restoring a WAL from
archive do not increment anything about write/flush LSN/TLI. I wonder if both
wal_receiver stats and WAL shipping stats might be merged together in the same
refactored structure in shmem, as they might share a fair number of field
together? This would be pretty invasive in the code, but I feel it's heavier to
add another new struct in shmem just to track WAL shipping stats whereas WalRcv
already exists there.
Now, I think the following additional field might be useful for monitoring. But
as this is out my my original scope, I prefer discussing how useful this might
* start_time: start time of the current source
* restored_count: total number of WAL restored. We might want to split this
counter to track each method individually.
* last_received_time: last time we received something from the current source
* last_fail_time: last failure time, whatever the source
Thanks for reading up to here!
 even if I still hope the pg_stat_get_wal_receiver might still gather some
more positive vote :)
|Next Message||Alvaro Herrera||2020-02-11 19:09:29||Re: Portal->commandTag as an enum|
|Previous Message||marcelo zen||2020-02-11 17:04:59||Re: Just for fun: Postgres 20?|