Re: Should the archiver process always make sure that the timeline history files exist in the archive?

From: vignesh C <vignesh21(at)gmail(dot)com>
To: Jimmy Yih <jyih(at)vmware(dot)com>
Cc: Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Should the archiver process always make sure that the timeline history files exist in the archive?
Date: 2024-01-11 15:08:31
Message-ID: CALDaNm1jxAzLJBMBRJL+8DFO-YVP_9ehmVDhMHSrFaMjf4WBsQ@mail.gmail.com
Lists: pgsql-hackers

On Tue, 29 Aug 2023 at 06:29, Jimmy Yih <jyih(at)vmware(dot)com> wrote:
>
> Thanks for the insightful response! I have attached an updated patch
> that moves the proposed logic to the end of StartupXLOG where it seems
> more correct to do this. It also helps with backporting (if it's
> needed) since the archiver process only has access to shared memory
> starting from Postgres 14.
>
> Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com> wrote:
> > A. The OP suggests archiving the timeline history file for the current
> > timeline every time the archiver starts. However, I don't think we
> > want to keep archiving the same file over and over. (Granted, we're
> > not always perfect at avoiding that..)
>
> With the updated proposed patch, we'll be checking if the current
> timeline history file needs to be archived at the end of StartupXLOG
> if archiving is enabled. If it detects that a .ready or .done file
> already exists, then it won't do anything (which will be the common
> case). I agree though that this may be an excessive check since it'll
> be a no-op the majority of the time. However, it shouldn't execute
> often and seems like a quick safe preventive measure. Could you give
> more details on why this would be too cumbersome?
>
> Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com> wrote:
> > B. Given that the steps are valid, I concur with what is described in the
> > test script provided: standbys don't really need that history file
> > for the initial TLI (though I have yet to fully verify this). If the
> > walreceiver just overlooks a fetch error for this file, the standby
> > can successfully start. (Just skipping the first history file seems
> > to work, but it feels a tad aggressive to me.)
>
> This was my initial thought as well but I wasn't sure if it was okay
> to overlook the fetch error. Initial testing and brainstorming seem
> to show that it's okay. I think the main bad thing is that these new
> standbys will not have their initial timeline history files which can
> be useful for administration. I've attached a patch that attempts this
> approach if we want to switch to this approach as the solution. The
> patch contains an updated TAP test as well to better showcase the
> issue and fix.
>
> Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com> wrote:
> > C. If those steps aren't valid, we might want to add a note stating
> > that -X none basebackups do need the timeline history file for the
> > initial TLI.
>
> The difficult thing about only documenting this is that it forces the
> user to manually store and track the timeline history files. It can be
> a bit cumbersome for WAL archiving users to recognize this scenario
> when they're just trying to optimize their basebackups by using
> -Xnone. But then again -Xnone does seem like it's designed for
> advanced users so this might be okay.
>
> Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com> wrote:
> > And don't forget to enable archive mode before the latest timeline
> > switch if any.
>
> This might not be reasonable since a user could've been using
> streaming replication and doing failover/failbacks as part of general
> high availability to manage their Postgres without knowing they were
> going to enable WAL archiving later on. The user would need to
> configure archiving and force a failover which may not be
> straightforward.

I have changed the status of the patch to "Waiting on Author" as
Robert's suggestions have not yet been addressed. Feel free to address
the suggestions and update the status accordingly.

Regards,
Vignesh
