Re: Fetching timeline during recovery

From: Jehan-Guillaume de Rorthais <jgdr(at)dalibo(dot)com>
To: Michael Paquier <michael(at)paquier(dot)xyz>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Fetching timeline during recovery
Date: 2019-07-24 12:33:27
Message-ID: 20190724143327.1ca2bb9a@firost
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hello Michael,

On Wed, 24 Jul 2019 09:49:05 +0900
Michael Paquier <michael(at)paquier(dot)xyz> wrote:

> On Tue, Jul 23, 2019 at 06:05:18PM +0200, Jehan-Guillaume de Rorthais wrote:
> > Please, find in attachment a first trivial patch to support
> > pg_walfile_name() and pg_walfile_name_offset() on a standby.
> > Previous restriction on this functions seems related to ThisTimeLineID not
> > being safe on standby. This patch is fetching the timeline from
> > WalRcv->receivedTLI using GetWalRcvWriteRecPtr(). As far as I understand,
> > this is updated each time some data are flushed to the WAL.
[...]
> Your patch does not count for the case of archive recovery, where
> there is no WAL receiver, and as the shared memory state of the WAL
> receiver is not set 0 would be set.

Indeed. I tested this topic with the following query and was fine with the
NULL result:

select pg_walfile_name(pg_last_wal_receive_lsn());

I was fine with this result because my use case requires replication anyway. A
NULL result would mean that the node never streamed from the old primary since
its last startup, so a failover should ignore it anyway.

However, NULL just comes from pg_last_wal_receive_lsn() here. The following
query result is wrong:

> select pg_walfile_name('0/1')
000000000000000000000000

I fixed that. See patch 0001-v2-* in attachement

> The replay timeline is something we could use here instead via
> GetXLogReplayRecPtr(). CreateRestartPoint actually takes the latest WAL
> receiver or replayed point for its end LSN position, whichever is newer.

I did consider GetXLogReplayRecPtr() or even XLogCtl->replayEndTLI (which is
updated right before the replay). However, both depend on read activity on the
standby. That's why I picked WalRcv->receivedTLI which is updated whatever the
reading activity on the standby.

> > Last, I plan to produce an extension to support this on older release. Is
> > it something that could be integrated in official source tree during a minor
> > release or should I publish it on eg. pgxn?
>
> Unfortunately no. This is a behavior change so it cannot find its way
> into back branches.

Yes, my patch is a behavior change. But here, I was yalking about an
extension, not the core itself, to support this feature in older releases.

> The WAL receiver state is in shared memory and published, so that's easy
> enough to get. We don't do that for XLogCtl unfortunately.

Both are in shared memory, but WalRcv have a public function to get its
receivedTLI member.

XLogCtl has nothing in public to expose its ThisTimeLineID member. However, from
a module, I'm able to fetch it using:

XLogCtl = ShmemInitStruct("XLOG Ctl", XLOGShmemSize(), &found);
SpinLockAcquire(&XLogCtl->info_lck);
tl = XLogCtl->ThisTimeLineID;
SpinLockRelease(&XLogCtl->info_lck);

As the "XLOG Ctl" index entry already exists in shmem, ShmemInitStruct returns
the correct structure from there. Not sure this was supposed to be used this
way though...Adding a public function might be cleaner, but it will not help
for older releases.

> I think that there are arguments for being more flexible with it, and perhaps
> have a system-level view to be able to look at some of its fields.

Great idea. I'll give it a try to keep the discussion on.

> There is also a downside with get_controlfile(), which is that it
> fetches directly the data from the on-disk pg_control, and
> post-recovery this only gets updated at the first checkpoint.

Indeed, that's why I started this patch and thread.

Thanks,

Attachment Content-Type Size
0001-v2-Support-pg_walfile_name-on-standby.patch text/x-patch 2.7 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Brendan Jurd 2019-07-24 12:54:39 Re: Issue in to_timestamp/to_date while handling the quoted literal string
Previous Message Chapman Flack 2019-07-24 12:14:40 Re: benchmarking Flex practices