Re: Weird failure with latches in curculio on v15

From: Nathan Bossart <nathandbossart(at)gmail(dot)com>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, Fujii Masao <fujii(at)postgresql(dot)org>, Michael Paquier <michael(at)paquier(dot)xyz>, Postgres hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Weird failure with latches in curculio on v15
Date: 2023-02-01 17:58:06
Message-ID: 20230201175806.GA3199959@nathanxps13
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, Feb 01, 2023 at 08:58:01AM -0800, Andres Freund wrote:
> On 2023-02-01 10:12:26 -0500, Tom Lane wrote:
>> The fundamental issue is that we have no good way to break out
>> of system(), and I think the original idea was that
>> in_restore_command would be set *only* for the duration of the
>> system() call. That's clearly been lost sight of completely,
>> but maybe as a stopgap we could try to get back to that.
>
> We could push the functions setting in_restore_command down into
> ExecuteRecoveryCommand(). But I don't think that'd end up necessarily
> being right either - we'd now use the mechanism in places we previously
> didn't (cleanup/end commands).

Right, we'd only want to set it for restore_command. I think that's
doable.

> And there's just plenty other stuff in the 14bdb3f13de 9a740f81eb0 that
> doesn't look right:
> - We now have two places open-coding what BuildRestoreCommand did

This was done because BuildRestoreCommand() had become a thin wrapper
around replace_percent_placeholders(). I can add it back if you don't
think this was the right decision.

> - I'm doubtful that the new shell_* functions are the base for a good
> API to abstract restoring files

Why?

> - the error message for a failed restore command seems to have gotten
> worse:
> could not restore file \"%s\" from archive: %s"
> ->
> "%s \"%s\": %s", commandName, command

Okay, I'll work on improving this message.

> - shell_* imo is not a good namespace for something called from xlog.c,
> xlogarchive.c. I realize the intention is that shell_archive.c is
> going to be its own "restore module", but for now it imo looks odd

What do you propose instead? FWIW this should go away with recovery
modules. This is just an intermediate state to simplify those patches.

> - The comment moved out of RestoreArchivedFile() doesn't seems less
> useful at its new location

Where do you think it should go?

> - explanation of why we use GetOldestRestartPoint() is halfway lost

Okay, I'll work on adding more context here.

--
Nathan Bossart
Amazon Web Services: https://aws.amazon.com

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Andres Freund 2023-02-01 18:18:27 Re: Weird failure with latches in curculio on v15
Previous Message Robert Haas 2023-02-01 17:51:58 Re: Show various offset arrays for heap WAL records