Re: recovery modules

From: Nathan Bossart <nathandbossart(at)gmail(dot)com>
To: Michael Paquier <michael(at)paquier(dot)xyz>
Cc: Andres Freund <andres(at)anarazel(dot)de>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: recovery modules
Date: 2023-01-27 23:28:21
Message-ID: 20230127232821.GA2221918@nathanxps13
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, Jan 26, 2023 at 09:40:58PM -0800, Nathan Bossart wrote:
> On Wed, Jan 25, 2023 at 04:34:21PM +0900, Michael Paquier wrote:
>> The loop part is annoying.. I've never been a fan of adding this
>> cross-value checks for the archiver modules in the first place, and it
>> would make things much simpler in the checkpointer if we need to think
>> about that as we want these values to be reloadable. Perhaps this
>> could just be an exception where we just give priority on one over the
>> other archive_cleanup_command? The startup process has a well-defined
>> sequence after a failure, while the checkpointer is designed to remain
>> robust.
>
> Yeah, there are some problems here. If we ERROR, we'll just bounce back to
> the sigsetjmp() block once a second, and we'll never pick up configuration
> reloads, shutdown signals, etc. If we FATAL, we'll just rapidly restart
> over and over. Given the dicussion about misconfigured archiving
> parameters [0], I doubt folks will be okay with giving priority to one or
> the other.
>
> I'm currently thinking that the checkpointer should set a flag and clear
> the recovery callbacks when a misconfiguration is detected. Anytime the
> checkpointer tries to use the archive-cleanup callback, a WARNING would be
> emitted. This is similar to an approach I proposed for archiving
> misconfigurations (that we didn't proceed with) [1]. Given the
> aformentioned problems, this approach might be more suitable for the
> checkpointer than it is for the archiver.

The more I think about this, the more I wonder whether we really need to
include archive_cleanup_command and recovery_end_command in recovery
modules. Another weird thing with the checkpointer is that the
restore_library will stay loaded long after recovery is finished, and it'll
be loaded regardless of whether recovery is required in the first place.
Of course, that typically won't cause any problems, and we could wait until
we need to do archive cleanup to load the library (and call its shutdown
callback when recovery is finished), but this strikes me as potentially
more complexity than the feature is worth. Perhaps we should just focus on
covering the restore_command functionality for now and add the rest later.

--
Nathan Bossart
Amazon Web Services: https://aws.amazon.com

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2023-01-27 23:30:35 Re: heapgettup() with NoMovementScanDirection unused in core?
Previous Message Tom Lane 2023-01-27 23:28:16 Re: Optimizing PostgreSQL with LLVM's PGO+LTO