Re: archive modules loose ends

From: Nathan Bossart <nathandbossart(at)gmail(dot)com>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Michael Paquier <michael(at)paquier(dot)xyz>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: archive modules loose ends
Date: 2023-11-21 04:30:44
Message-ID: 20231121043044.GA3521465@nathanxps13
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, Nov 13, 2023 at 03:35:28PM -0800, Andres Freund wrote:
> On 2023-11-13 16:42:31 -0600, Nathan Bossart wrote:
>> There seems to be no interest in this patch, so I plan to withdraw it from
>> the commitfest system by the end of the month unless such interest
>> materializes.
>
> I think it might just have arrived too shortly before the feature freeze to be
> worth looking at at the time, and then it didn't really re-raise attention
> until now. I'm so far behind on keeping up with the list that I rarely end up
> looking far back for things I'd like to have answered... Sorry.

No worries. I appreciate the review.

>> I see two main options for dealing with this. One option is to simply have
>> shell_archive (and any other archive modules out there) maintain its own
>> memory context like basic_archive does. This ends up requiring a whole lot
>> of duplicate code between the two built-in modules, though. Another option
>> is to have the archiver manage a memory context that it resets after every
>> invocation of the archiving callback, ERROR or not.
>
> I think passing in a short-lived memory context is a lot nicer to deal with.

Cool.

>> This has the advantage of avoiding code duplication and simplifying things
>> for the built-in modules, but any external modules that rely on palloc'd
>> state being long-lived would need to be adjusted to manage their own
>> long-lived context. (This would need to be appropriately documented.)
>
> Alternatively we could provide a longer-lived memory context in
> ArchiveModuleState, set up by the genric infrastructure. That context would
> obviously still need to be explicitly utilized by a module, but no duplicated
> setup code would be required.

Sure. Right now, I'm not sure there's too much need for that. A module
could just throw stuff in TopMemoryContext, and you probably wouldn't have
any leaks because the archiver just restarts on any ERROR or
archive_library change. But that's probably not a pattern we want to
encourage long-term. I'll jot this down for a follow-up patch idea.

> I think we should just have the AtEOXact_Files() in pgarch.c, then no
> PG_TRY/CATCH is needed here. At the moment I think just about every possible
> use of an archive modules would require using files, so there doesn't seem
> much of a reason to not handle it in pgarch.c.

WFM

> I'd probably reset a few other subsystems at the same time (there's probably
> more):
> - disable_all_timeouts()
> - LWLockReleaseAll()
> - ConditionVariableCancelSleep()
> - pgstat_report_wait_end()
> - ReleaseAuxProcessResources()

I looked around a bit and thought AtEOXact_HashTables() belonged here as
well. I'll probably give this one another pass to see if there's anything
else obvious.

> It could be worth setting up an errcontext providing the module and file
> that's being processed. I personally find that at least as important as
> setting up a ps string detailing the log file... But I guess that could be a
> separate patch.

Indeed. Right now we rely on the module to emit sufficiently-detailed
logs, but it'd be nice if they got that for free.

> It'd be nice to add a comment explaining why pgarch_archiveXlog() is the right
> place to handle errors.

Will do.

--
Nathan Bossart
Amazon Web Services: https://aws.amazon.com

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Zhijie Hou (Fujitsu) 2023-11-21 04:31:39 RE: Synchronizing slots from primary to standby
Previous Message Amit Kapila 2023-11-21 04:28:54 Re: Synchronizing slots from primary to standby