Re: Fetching timeline during recovery

From: Michael Paquier <michael(at)paquier(dot)xyz>
To: Jehan-Guillaume de Rorthais <jgdr(at)dalibo(dot)com>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Fetching timeline during recovery
Date: 2019-07-24 00:49:05
Message-ID: 20190724004905.GG2059@paquier.xyz
Lists: pgsql-hackers

On Tue, Jul 23, 2019 at 06:05:18PM +0200, Jehan-Guillaume de Rorthais wrote:
> Please, find in attachment a first trivial patch to support pg_walfile_name()
> and pg_walfile_name_offset() on a standby.
> The previous restriction on these functions seems related to
> ThisTimeLineID not being safe on a standby. This patch fetches the
> timeline from WalRcv->receivedTLI using GetWalRcvWriteRecPtr(). As far
> as I understand, this is updated each time some data is flushed to the WAL.

FWIW, I have no objection to loosening the restrictions on those
functions a bit if we can make that reliable enough. During recovery
you cannot rely on ThisTimeLineID, as you say, mostly because of the
following bit in xlog.c (the comment block a little above it also has
explanations):
    /*
     * ThisTimeLineID is normally not set when we're still in recovery.
     * However, recycling/preallocating segments above needed ThisTimeLineID
     * to determine which timeline to install the segments on. Reset it now,
     * to restore the normal state of affairs for debugging purposes.
     */
    if (RecoveryInProgress())
        ThisTimeLineID = 0;
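
To illustrate why a zeroed timeline matters for these functions, here is
a minimal standalone sketch of how a WAL segment file name is derived
from a timeline and an LSN. It is not the server code: the typedefs are
stand-ins, sketch_walfile_name() is a hypothetical helper, and the
default 16MB segment size is assumed.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

typedef uint64_t XLogRecPtr;    /* stand-in for the server typedef */
typedef uint32_t TimeLineID;    /* stand-in for the server typedef */

/* Assumption: default 16MB WAL segments. */
#define SKETCH_WAL_SEGMENT_SIZE (UINT64_C(16) * 1024 * 1024)

/*
 * Hypothetical helper: write the 24-character segment file name for
 * (tli, lsn) into buf. The name is three 8-digit hex fields: the
 * timeline, then the segment number split into high and low parts.
 */
static void
sketch_walfile_name(char *buf, size_t len, TimeLineID tli, XLogRecPtr lsn)
{
    uint64_t    segno = lsn / SKETCH_WAL_SEGMENT_SIZE;
    uint64_t    segs_per_xlogid = UINT64_C(0x100000000) / SKETCH_WAL_SEGMENT_SIZE;

    snprintf(buf, len, "%08" PRIX32 "%08" PRIX32 "%08" PRIX32,
             (uint32_t) tli,
             (uint32_t) (segno / segs_per_xlogid),
             (uint32_t) (segno % segs_per_xlogid));
}
```

With LSN 0/3000060 and timeline 1 this yields
000000010000000000000003, while a timeline of 0 yields
000000000000000000000003, a name that matches no real segment.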

Your patch does not account for the case of archive recovery, where
there is no WAL receiver: because the WAL receiver's shared memory
state is never set in that case, a timeline of 0 would be returned.
The replay timeline is something we could use here instead, via
GetXLogReplayRecPtr(). CreateRestartPoint() actually takes the latest
WAL receiver or replayed point for its end LSN position, whichever is
newer.
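
As a rough sketch of that "whichever is newer" choice, assuming the two
positions have already been fetched (in the server they would come from
GetWalRcvWriteRecPtr() and GetXLogReplayRecPtr()). The struct and
function names below are hypothetical and this is not the code of
CreateRestartPoint(); it only models the selection and the fallback
when no WAL receiver has set its state.

```c
#include <stdint.h>

typedef uint64_t XLogRecPtr;    /* stand-in for the server typedef */
typedef uint32_t TimeLineID;    /* stand-in for the server typedef */

/* Hypothetical pair: a WAL position and the timeline it belongs to. */
typedef struct SketchWalPos
{
    XLogRecPtr  lsn;
    TimeLineID  tli;
} SketchWalPos;

/*
 * Return the newer of the received and replayed positions. A received
 * timeline of 0 is treated as "no WAL receiver state" (the archive
 * recovery case), so the replayed position is used instead.
 */
static SketchWalPos
sketch_latest_recovery_pos(SketchWalPos received, SketchWalPos replayed)
{
    if (received.tli != 0 && received.lsn > replayed.lsn)
        return received;
    return replayed;
}
```

The timeline in the returned pair is the one that would be fed to the
file name computation, so archive recovery never ends up with 0.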

> Last, I plan to produce an extension to support this on older releases. Is
> it something that could be integrated in the official source tree during a
> minor release, or should I publish it on e.g. pgxn?

Unfortunately no. This is a behavior change so it cannot find its way
into back branches. The WAL receiver state is in shared memory and
published, so that's easy enough to get. We don't do that for XLogCtl
unfortunately. I think that there are arguments for being more
flexible with it, and perhaps have a system-level view to be able to
look at some of its fields.

There is also a downside with get_controlfile(), which is that it
fetches the data directly from the on-disk pg_control, and
post-recovery this only gets updated at the first checkpoint.
--
Michael
