Re: Non-blocking archiver process

From: Patrick Stählin <me(at)packi(dot)ch>
To: Noah Misch <noah(at)leadboat(dot)com>, Ronan Dunklau <ronan(dot)dunklau(at)aiven(dot)io>
Cc: pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: Non-blocking archiver process
Date: 2025-07-27 15:45:35
Message-ID: 33c0978a-6f7a-4609-b57e-79db69b23922@packi.ch
Lists: pgsql-hackers

On 05.07.25 05:01, Noah Misch wrote:
> On Fri, Jul 04, 2025 at 08:46:08AM +0200, Ronan Dunklau wrote:
>> We've noticed a behavior that seems surprising to us.
>> Since DROP DATABASE now waits for a ProcSignalBarrier, it can hang up
>> indefinitely if the archive_command hangs.
>>
>> The reason for this is that the builtin archive module doesn't process any
>> interrupts while the archiving command is running, as it's run with a system()
>> call, blocking indefinitely.
>>
>> Before rushing on to implement a non-blocking archive library (perhaps using
>> popen or posix_spawn, while keeping other systems in mind), what unintended
>> consequences would it have to actually run the archive_command in a
>> non-blocking way, and check interrupts while it runs?
>
> I can't think of any unintended consequences. I think we just missed this
> when adding the first use of ProcSignalBarrier (v15). Making this easier to
> miss, archiver spent most of its history not connecting to shared memory. Its
> shared memory connection appeared in v14.

I've taken some time, mostly for WIN32, to implement an interruptible
version of archive_command. My WIN32 days are long behind me, so it's
quite possible that this has some faults I'm not seeing. Then again, it
passes CI.
I failed to make it work on WIN32 with popen, since the handles it
returns can't be made non-blocking, so this change is a bit bigger.
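
To illustrate the general idea discussed above (not the attached patch
itself), here is a minimal POSIX-only sketch: spawn the command with
posix_spawn, poll it with waitpid(WNOHANG), and look at an interrupt flag
between polls. The names run_command_interruptible and interrupt_pending are
hypothetical; in the server, the polling point is roughly where
CHECK_FOR_INTERRUPTS() would go. Cancelling the command on interrupt is an
assumption made for the sketch, since processing a barrier would not
necessarily require killing the child. WIN32 is not covered here.

```c
#define _POSIX_C_SOURCE 200809L

#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <spawn.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>

extern char **environ;

/* Hypothetical stand-in for the server's pending-interrupt state. */
volatile sig_atomic_t interrupt_pending = 0;

/*
 * Run "command" via /bin/sh without blocking in system().
 * Returns 0 when the command finished (wait status stored in *status),
 * 1 when it was cancelled because interrupt_pending was set,
 * and -1 when spawning or waiting failed.
 */
int
run_command_interruptible(const char *command, int *status)
{
	char *const argv[] = {"sh", "-c", (char *) command, NULL};
	const struct timespec delay = {0, 10 * 1000 * 1000};	/* 10 ms */
	pid_t		pid;

	assert(command != NULL && status != NULL);

	if (posix_spawn(&pid, "/bin/sh", NULL, NULL, argv, environ) != 0)
		return -1;

	for (;;)
	{
		pid_t		rc = waitpid(pid, status, WNOHANG);

		if (rc == pid)
			return 0;			/* child exited; *status is valid */
		if (rc < 0 && errno != EINTR)
			return -1;

		/* Polling point: a real archiver would service interrupts here. */
		if (interrupt_pending)
		{
			kill(pid, SIGTERM);
			while (waitpid(pid, status, 0) < 0 && errno == EINTR)
				;				/* reap the child to avoid a zombie */
			return 1;
		}

		nanosleep(&delay, NULL);
	}
}
```

The 10 ms poll interval is arbitrary. A production version would more
likely wait on a latch or a signal than sleep in a loop, but the loop keeps
the control flow easy to follow.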

@Ronan: Let me know if you'd like more attribution; I took some
inspiration from a private repo with your prototype.

I don't know if I should add that to the running commitfest for PG19 or
if this is something that would need to be backported. Just let me know.

Thanks,
Patrick

Attachment: 0001-Check-for-interrupts-during-archive_command.patch (text/x-patch, 4.2 KB)
