Re: Network failure may prevent promotion

From: Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
To: Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>, masao(dot)fujii(at)gmail(dot)com
Cc: andres(at)anarazel(dot)de, michael(at)paquier(dot)xyz, pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: Network failure may prevent promotion
Date: 2024-01-23 09:43:43
Message-ID: ea7b8012-b739-436e-afe4-be0b2f69b304@iki.fi
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 23/01/2024 10:24, Kyotaro Horiguchi wrote:
> Thank you for looking this!
>
> At Tue, 23 Jan 2024 15:07:10 +0900, Fujii Masao <masao(dot)fujii(at)gmail(dot)com> wrote in
>> Regarding the patch, here are the review comments.
>>
>> +/*
>> + * Is current process a wal receiver?
>> + */
>> +bool
>> +IsWalReceiver(void)
>> +{
>> + return WalRcv != NULL;
>> +}
>>
>> This looks wrong because WalRcv can be non-NULL in processes other
>> than walreceiver.
>
> Mmm. Sorry for the silly mistake. We can use B_WAL_RECEIVER
> instead. I'm not sure if the new function IsWalReceiver() is
> required. The expression "MyBackendType == B_WAL_RECEIVER" is quite
> descriptive. However, the function does make ProcessInterrupts() more
> aligned regarding process types.

There's an existing AmWalReceiverProcess() macro too. Let's use that.

(See also
https://www.postgresql.org/message-id/f3ecd4cb-85ee-4e54-8278-5fabfb3a4ed0%40iki.fi
for refactoring in this area)

Here's a patch set summarizing the changes so far. They should be
squashed, but I kept them separate for now to help with review:

1. revert the revert of 728f86fec6.
2. your walrcv_shutdown_deblocking_v2-2.patch
3. Also replace libpqrcv_PQexec() and libpqrcv_PQgetResult() with the
wrappers from libpq-be-fe-helpers.h
4. Replace IsWalReceiver() with AmWalReceiverProcess()

>> - pqsignal(SIGTERM, SignalHandlerForShutdownRequest); /* request shutdown */
>> + pqsignal(SIGTERM, WalRcvShutdownSignalHandler); /* request shutdown */
>>
>> Can't we just use die(), instead?
>
> There was a comment explaining the problems associated with exiting
> within a signal handler;
>
> - * Currently, only SIGTERM is of interest. We can't just exit(1) within the
> - * SIGTERM signal handler, because the signal might arrive in the middle of
> - * some critical operation, like while we're holding a spinlock. Instead, the
>
> And I think we should keep the considerations it suggests. The patch
> removes the comment itself, but it does so because it implements our
> standard process exit procedure, which incorporates points suggested
> by the now-removed comment.

die() doesn't call exit(1). Unless DoingCommandRead is set, but it never
is in the walreceiver. It looks just like the new
WalRcvShutdownSignalHandler() function. Am I missing something?

Hmm, but doesn't bgworker_die() have that problem with exit(1)ing in the
signal handler?

I also wonder if we should replace SignalHandlerForShutdownRequest()
completely with die(), in all processes? The difference is that
SignalHandlerForShutdownRequest() uses ShutdownRequestPending, while
die() uses ProcDiePending && InterruptPending to indicate that the
signal was received. Or do some of the processes want to check for
ShutdownRequestPending only at specific places, and don't want to get
terminated at the any random CHECK_FOR_INTERRUPTS()?

--
Heikki Linnakangas
Neon (https://neon.tech)

Attachment Content-Type Size
v3-0001-Revert-Revert-libpqwalreceiver-Convert-to-libpq-b.patch text/x-patch 3.4 KB
v3-0002-Apply-walrcv_shutdown_deblocking_v2-2.patch.patch text/x-patch 6.2 KB
v3-0003-Use-libpq-be-fe-helpers.h-wrappers-more.patch text/x-patch 8.6 KB
v3-0004-Use-existing-AmWalReceiverProcess-function.patch text/x-patch 1.3 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Ashutosh Bapat 2024-01-23 09:51:37 Re: [17] CREATE SUBSCRIPTION ... SERVER
Previous Message Peter Eisentraut 2024-01-23 09:30:05 Re: make dist using git archive