Re: Weird failure with latches in curculio on v15

From: Andres Freund <andres(at)anarazel(dot)de>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Nathan Bossart <nathandbossart(at)gmail(dot)com>, Michael Paquier <michael(at)paquier(dot)xyz>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, Fujii Masao <fujii(at)postgresql(dot)org>, Postgres hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Weird failure with latches in curculio on v15
Date: 2023-02-25 19:00:31
Message-ID: 20230225190031.e3vesk22q5wpmmhc@awork3.anarazel.de
Lists: pgsql-hackers

Hi,

On 2023-02-19 20:06:24 +0530, Robert Haas wrote:
> On Sun, Feb 19, 2023 at 2:45 AM Andres Freund <andres(at)anarazel(dot)de> wrote:
> > To me that seems even simpler? Nothing but the archiver is supposed to create
> > .done files and nothing is supposed to remove .ready files without archiver
> > having created the .done files. So the archiver process can scan
> > archive_status until it's done or until N archives have been collected, and
> > then process them at once? Only the creation of the .done files would be
> > serial, but I don't think that's commonly a problem (and could be optimized as
> > well, by creating multiple files and then fsyncing them in a second pass,
> > avoiding N filesystem journal flushes).
> >
> > Maybe I am misunderstanding what you see as the problem?
>
> Well right now the archiver process calls ArchiveFileCB when there's a
> file ready for archiving, and that process is supposed to archive the
> whole thing before returning. That pretty obviously seems to preclude
> having more than one file being archived at the same time. What
> callback structure do you have in mind to allow for that?

TBH, I think the current archive and restore module APIs aren't useful. I
think it was a mistake to add archive modules without having demonstrated that
one can do something useful with them that the archive_command didn't already
do. If anything, archive modules have made it harder to improve archiving
performance via concurrency.

My point was that it's easy to have multiple archive commands in process at
the same time, because we already have a queuing system, and that
archive_command is entirely compatible with doing that, because running multiple
subprocesses is pretty trivial. It wasn't that the archive API is suitable for
that.

> I mean, my idea was to basically just have one big callback:
> ArchiverModuleMainLoopCB(). Which wouldn't return, or perhaps, would
> only return when archiving was totally caught up and there was nothing
> more to do right now. And then that callback could call functions like
> AreThereAnyMoreFilesIShouldBeArchivingAndIfYesWhatIsTheNextOne(). So
> it would call that function and it would find out about a file and
> start an HTTP session or whatever and then call that function again
> and start another HTTP session for the second file and so on until it
> had as much concurrency as it wanted. And then when it hit the
> concurrency limit, it would wait until at least one HTTP request
> finished. At that point it would call
> HeyEverybodyISuccessfullyArchivedAWalFile(), after which it could
> again ask for the next file and start a request for that one and so on
> and so forth.

> I don't really understand what the other possible model is here,
> honestly. Right now, control remains within the archive module for the
> entire time that a file is being archived. If we generalize the model
> to allow multiple files to be in the process of being archived at the
> same time, the archive module is going to need to have control as long
> as >= 1 of them are in progress, at least AFAICS. If you have some
> other idea how it would work, please explain it to me...

I don't think that a main loop approach is the only viable one. It might be
the one most likely to succeed, though. As an alternative, consider something
like

typedef struct ArchiveFileState
{
    int         fd;             /* descriptor the module is blocked on */
    enum { READ, WRITE, CONNECT } wait_for;    /* event to wait for on fd */
    void       *file_private;   /* module-private per-file state */
} ArchiveFileState;

typedef bool (*ArchiveFileStartCB) (ArchiveModuleState *state,
                                    ArchiveFileState *file_state,
                                    const char *file, const char *path);

typedef bool (*ArchiveFileContinueCB) (ArchiveModuleState *state,
                                       ArchiveFileState *file_state);

An archive module could open an HTTP connection, do IO until it's blocked, put
the fd in file_state, and return. The main loop could run a big event loop
around all of the file descriptors and, whenever any of the FDs signals that IO
is ready, call ArchiveFileContinueCB() for that file.
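
To make the shape of that event loop concrete, here is a rough, standalone
sketch using plain poll(). Everything in it is an assumption for illustration:
the struct is a simplified stand-in (no ArchiveModuleState argument, an extra
hypothetical "done" flag, WAIT_* names instead of the enum above), the
callback is assumed to return true once its file is finished,
archive_event_loop() and demo_continue_cb() are made-up names, and a real
archiver would presumably use a WaitEventSet rather than poll().

```c
#include <errno.h>
#include <poll.h>
#include <stdbool.h>
#include <stddef.h>
#include <unistd.h>

#define MAX_INFLIGHT 16			/* hypothetical concurrency limit */

typedef enum { WAIT_READ, WAIT_WRITE, WAIT_CONNECT } WaitFor;

/* Simplified stand-in for the per-file state sketched in the mail. */
typedef struct ArchiveFileState
{
	int			fd;				/* descriptor the module is blocked on */
	WaitFor		wait_for;		/* event the module wants to wait for */
	void	   *file_private;	/* module-private per-file state */
	bool		done;			/* hypothetical: set once the callback finishes */
} ArchiveFileState;

/* Assumed contract: returns true once this file needs no more calls. */
typedef bool (*ArchiveFileContinueCB) (ArchiveFileState *file_state);

/*
 * Poll every unfinished file and call the continue callback for each fd
 * that reports an event. Returns the number of callback invocations, or
 * -1 if nfiles is out of range or poll() fails.
 */
static int
archive_event_loop(ArchiveFileState *files, int nfiles,
				   ArchiveFileContinueCB cont)
{
	struct pollfd pfds[MAX_INFLIGHT];
	int			owner[MAX_INFLIGHT];
	int			calls = 0;

	if (nfiles < 0 || nfiles > MAX_INFLIGHT)
		return -1;

	for (;;)
	{
		int			npending = 0;

		for (int i = 0; i < nfiles; i++)
		{
			if (files[i].done)
				continue;
			pfds[npending].fd = files[i].fd;
			/* a pending connect() completes as "writable" */
			pfds[npending].events =
				(files[i].wait_for == WAIT_READ) ? POLLIN : POLLOUT;
			pfds[npending].revents = 0;
			owner[npending++] = i;
		}

		if (npending == 0)
			return calls;		/* every file has finished */

		if (poll(pfds, (nfds_t) npending, -1) < 0)
		{
			if (errno == EINTR)
				continue;
			return -1;
		}

		for (int j = 0; j < npending; j++)
		{
			if (pfds[j].revents == 0)
				continue;
			files[owner[j]].done = cont(&files[owner[j]]);
			calls++;
		}
	}
}

/* Demo callback: count bytes read from fd; finished on EOF or error. */
static bool
demo_continue_cb(ArchiveFileState *file_state)
{
	char		buf[64];
	ssize_t		n = read(file_state->fd, buf, sizeof(buf));

	if (n > 0)
	{
		*(size_t *) file_state->file_private += (size_t) n;
		return false;			/* more data may follow */
	}
	return true;
}
```

The start callback is left out; in this sketch the caller is assumed to have
already filled in fd and wait_for for each file before entering the loop.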

I don't know if that's better than ArchiverModuleMainLoopCB(). I can see both
advantages and disadvantages.

Greetings,

Andres Freund
