Re: Fetching timeline during recovery

From: Jehan-Guillaume de Rorthais <jgdr(at)dalibo(dot)com>
To: Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>
Cc: michael(at)paquier(dot)xyz, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Fetching timeline during recovery
Date: 2019-07-26 08:02:58
Message-ID: 20190726100258.2dca596c@firost
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Fri, 26 Jul 2019 16:49:53 +0900 (Tokyo Standard Time)
Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com> wrote:

> Hi.
>
> At Thu, 25 Jul 2019 19:38:08 +0200, Jehan-Guillaume de Rorthais
> <jgdr(at)dalibo(dot)com> wrote in <20190725193808(dot)1648ddc8(at)firost>
> > On Wed, 24 Jul 2019 14:33:27 +0200
> > Jehan-Guillaume de Rorthais <jgdr(at)dalibo(dot)com> wrote:
> >
> > > On Wed, 24 Jul 2019 09:49:05 +0900
> > > Michael Paquier <michael(at)paquier(dot)xyz> wrote:
> > >
> > > > On Tue, Jul 23, 2019 at 06:05:18PM +0200, Jehan-Guillaume de Rorthais
> > > > wrote:
> > [...]
> > > > I think that there are arguments for being more flexible with it, and
> > > > perhaps have a system-level view to be able to look at some of its
> > > > fields.
> > >
> > > Great idea. I'll give it a try to keep the discussion on.
> >
> > After some thinking, I did not find enough data to expose to justify the
> > creation a system-level view. As I just need the current timeline I
> > wrote "pg_current_timeline()". Please, find the patch in attachment.
> >
> > The current behavior is quite simple:
> > * if the cluster is in production, return ThisTimeLineID
> > * else return walrcv->receivedTLI (using GetWalRcvWriteRecPtr)
> >
> > This is really naive implementation. We should probably add some code around
> > the startup process to gather and share general recovery stats. This would
> > allow to fetch eg. the current recovery method, latest xlog file name
> > restored from archives or streaming, its timeline, etc.
> >
> > Any thoughts?
>
> If replay is delayed behind timeline switch point, replay-LSN and
> receive/write/flush LSNs are on different timelines. When
> replica have not reached the new timeline to which alredy
> received file belongs, the fucntion returns wrong file name,
> specifically a name consisting of the latest segment number and
> the older timeline where the segment doesn't belong to.

Indeed.

> We have an LSN reporting function each for several objectives.
>
> pg_current_wal_lsn
> pg_current_wal_insert_lsn
> pg_current_wal_flush_lsn
> pg_last_wal_receive_lsn
> pg_last_wal_replay_lsn

Yes. In fact, my current implementation might be split as:

pg_current_wal_tl: returns TL on a production cluster
pg_last_wal_received_tl: returns last received TL on a standby

If useful, I could add pg_last_wal_replayed_tl. I don't think *insert_tl and
*flush_tl would be useful as a cluster in production is not supposed to
change its timeline during its lifetime.

> But, I'm not sure just adding further pg_last_*_timeline() to
> this list is a good thing..

I think this is a much better idea than mixing different case (production and
standby) in the same function as I did. Moreover, it's much more coherent with
other existing functions.

> The function returns NULL for NULL input (STRICT behavior) but
> returns (NULL, NULL) for undefined timeline. I don't think the
> differene is meaningful.

Unless I'm missing something, nothing
returns "(NULL, NULL)" in 0001-v1-Add-function-pg_current_timeline.patch.

Thank you for your feedback!

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Michael Paquier 2019-07-26 08:03:31 Re: Add parallelism and glibc dependent only options to reindexdb
Previous Message vignesh C 2019-07-26 07:53:57 Re: block-level incremental backup