From: | Patrick Stählin <me(at)packi(dot)ch> |
---|---|
To: | Noah Misch <noah(at)leadboat(dot)com>, Ronan Dunklau <ronan(dot)dunklau(at)aiven(dot)io> |
Cc: | pgsql-hackers(at)lists(dot)postgresql(dot)org |
Subject: | Re: Non-blocking archiver process |
Date: | 2025-07-27 15:45:35 |
Message-ID: | 33c0978a-6f7a-4609-b57e-79db69b23922@packi.ch |
Lists: | pgsql-hackers |
On 05.07.25 05:01, Noah Misch wrote:
> On Fri, Jul 04, 2025 at 08:46:08AM +0200, Ronan Dunklau wrote:
>> We've noticed a behavior that seems surprising to us.
>> Since DROP DATABASE now waits for a ProcSignalBarrier, it can hang up
>> indefinitely if the archive_command hangs.
>>
>> The reason for this is that the builtin archive module doesn't process any
>> interrupts while the archiving command is running, as it's run with a system()
>> call, blocking indefinitely.
>>
>> Before rushing on to implement a non-blocking archive library (perhaps using
>> popen or posix_spawn, while keeping other systems in mind), what unintended
>> consequences would it have to actually run the archive_command in a
>> non-blocking way, checking for interrupts while it runs?
>
> I can't think of any unintended consequences. I think we just missed this
> when adding the first use of ProcSignalBarrier (v15). Making this easier to
> miss, archiver spent most of its history not connecting to shared memory. Its
> shared memory connection appeared in v14.
I've taken some time, mostly for WIN32, to implement an interruptible
version of archive_command. My WIN32 days are long behind me, so it's
quite possible that this has some faults I'm not seeing. Then again, it
passes CI.
I couldn't make it work on WIN32 with popen, since the handles it
returns can't be made non-blocking, so this change is a bit bigger.
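
For anyone reading along without the patch open, here is a rough,
POSIX-only sketch of the general idea: spawn the command instead of
calling system(), then poll for its exit while checking for interrupts.
This is an illustration and not the code in the attached patch. The
names run_command_interruptible, interrupt_check_fn and flag_is_set are
hypothetical, the callback is only a stand-in for the backend's real
interrupt handling, and the WIN32 side is not covered at all.

```c
#define _POSIX_C_SOURCE 200809L

#include <errno.h>
#include <signal.h>
#include <spawn.h>
#include <stdbool.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>

extern char **environ;

/* Hypothetical callback: returns true when the caller should stop waiting. */
typedef bool (*interrupt_check_fn) (void *arg);

/* Hypothetical example callback: treats *arg as an int "stop" flag. */
static bool
flag_is_set(void *arg)
{
	return *(int *) arg != 0;
}

/*
 * Run "command" via /bin/sh without blocking in system().
 *
 * Returns the wait status of the shell, or -1 (with errno set) if the
 * command could not be spawned or waited for.  If the callback asks us to
 * stop, the child's process group is sent SIGTERM, the child is reaped, and
 * *interrupted is set to true.
 */
static int
run_command_interruptible(const char *command, interrupt_check_fn check,
						  void *arg, bool *interrupted)
{
	char	   *const argv[] = {"sh", "-c", (char *) command, NULL};
	struct timespec delay = {0, 10 * 1000 * 1000};	/* 10 ms poll interval */
	posix_spawnattr_t attr;
	pid_t		pid;
	int			status;
	int			rc;

	*interrupted = false;

	/* Put the child in its own process group so we can signal all of it. */
	posix_spawnattr_init(&attr);
	posix_spawnattr_setflags(&attr, POSIX_SPAWN_SETPGROUP);
	posix_spawnattr_setpgroup(&attr, 0);
	rc = posix_spawn(&pid, "/bin/sh", NULL, &attr, argv, environ);
	posix_spawnattr_destroy(&attr);
	if (rc != 0)
	{
		errno = rc;
		return -1;
	}

	for (;;)
	{
		pid_t		r = waitpid(pid, &status, WNOHANG);

		if (r == pid)
			return status;		/* command finished on its own */
		if (r < 0 && errno != EINTR)
			return -1;

		if (check != NULL && check(arg))
		{
			*interrupted = true;
			/* Signal the whole group; fall back to the shell alone. */
			if (kill(-pid, SIGTERM) != 0)
				(void) kill(pid, SIGTERM);
			while (waitpid(pid, &status, 0) < 0)
			{
				if (errno != EINTR)
					return -1;
			}
			return status;
		}

		nanosleep(&delay, NULL);
	}
}
```

Polling with a short sleep keeps the sketch small. A real implementation
would more likely wait on a latch or similar so that it wakes up as soon
as there is something to do, and it would need to decide what happens to
the WAL segment when the command is cut short.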
@Ronan: Let me know if you'd like more attribution; I took some
inspiration from a private repo with your prototype.
I don't know if I should add that to the running commitfest for PG19 or
if this is something that would need to be backported. Just let me know.
Thanks,
Patrick
Attachment | Content-Type | Size |
---|---|---|
0001-Check-for-interrupts-during-archive_command.patch | text/x-patch | 4.2 KB |