Re: Weird failure with latches in curculio on v15

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Nathan Bossart <nathandbossart(at)gmail(dot)com>, Michael Paquier <michael(at)paquier(dot)xyz>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, Fujii Masao <fujii(at)postgresql(dot)org>, Postgres hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Weird failure with latches in curculio on v15
Date: 2023-02-19 14:36:24
Message-ID: CA+TgmoaLCxrdHPSnRLD1j1FQQx4k7QSJq-ybVOYj2aEF34sQLQ@mail.gmail.com
Lists: pgsql-hackers

On Sun, Feb 19, 2023 at 2:45 AM Andres Freund <andres(at)anarazel(dot)de> wrote:
> To me that seems even simpler? Nothing but the archiver is supposed to create
> .done files and nothing is supposed to remove .ready files without archiver
> having created the .done files. So the archiver process can scan
> archive_status until it's done or until N archives have been collected, and
> then process them at once? Only the creation of the .done files would be
> serial, but I don't think that's commonly a problem (and could be optimized as
> well, by creating multiple files and then fsyncing them in a second pass,
> avoiding N filesystem journal flushes).
>
> Maybe I am misunderstanding what you see as the problem?

Well right now the archiver process calls ArchiveFileCB when there's a
file ready for archiving, and that callback is supposed to archive the
whole file before returning. That pretty obviously seems to preclude
having more than one file being archived at the same time. What
callback structure do you have in mind to allow for that?

I mean, my idea was to basically just have one big callback:
ArchiverModuleMainLoopCB(), which wouldn't return, or perhaps would
only return when archiving was totally caught up and there was nothing
more to do right now. And then that callback could call functions like
AreThereAnyMoreFilesIShouldBeArchivingAndIfYesWhatIsTheNextOne(). So
it would call that function and it would find out about a file and
start an HTTP session or whatever and then call that function again
and start another HTTP session for the second file and so on until it
had as much concurrency as it wanted. And then when it hit the
concurrency limit, it would wait until at least one HTTP request
finished. At that point it would call
HeyEverybodyISuccessfullyArchivedAWalFile(), after which it could
again ask for the next file and start a request for that one and so on
and so forth.

I don't really understand what the other possible model is here,
honestly. Right now, control remains within the archive module for the
entire time that a file is being archived. If we generalize the model
to allow multiple files to be in the process of being archived at the
same time, the archive module is going to need to have control as long
as >= 1 of them are in progress, at least AFAICS. If you have some
other idea how it would work, please explain it to me...

--
Robert Haas
EDB: http://www.enterprisedb.com
