Re: Network failure may prevent promotion

From: Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>
To: hlinnaka(at)iki(dot)fi
Cc: masao(dot)fujii(at)gmail(dot)com, andres(at)anarazel(dot)de, michael(at)paquier(dot)xyz, pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: Network failure may prevent promotion
Date: 2024-01-29 07:32:06
Message-ID: 20240129.163206.738744107103873975.horikyota.ntt@gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Thank you fixing the issue.

At Tue, 23 Jan 2024 11:43:43 +0200, Heikki Linnakangas <hlinnaka(at)iki(dot)fi> wrote i
n
> There's an existing AmWalReceiverProcess() macro too. Let's use that.

Mmm. I sought an Is* function becuase "IsLogicalWorker()" is placed on
the previous line. Our convention regarding those functions (macros)
and variables seems inconsistent. However, I can't say for sure that
we should unify all of them.

> (See also
> https://www.postgresql.org/message-id/f3ecd4cb-85ee-4e54-8278-5fabfb3a4ed0%40iki.fi
> for refactoring in this area)
>
> Here's a patch set summarizing the changes so far. They should be
> squashed, but I kept them separate for now to help with review:
>
> 1. revert the revert of 728f86fec6.
> 2. your walrcv_shutdown_deblocking_v2-2.patch
> 3. Also replace libpqrcv_PQexec() and libpqrcv_PQgetResult() with the
> wrappers from libpq-be-fe-helpers.h

Both replacements look fine. I didn't find another instance of similar
code.

> 4. Replace IsWalReceiver() with AmWalReceiverProcess()

Just look fine.

> >> - pqsignal(SIGTERM, SignalHandlerForShutdownRequest); /* request
> >> - shutdown */
> >> + pqsignal(SIGTERM, WalRcvShutdownSignalHandler); /* request shutdown
> >> */
> >>
> >> Can't we just use die(), instead?
> > There was a comment explaining the problems associated with exiting
> > within a signal handler;
> > - * Currently, only SIGTERM is of interest. We can't just exit(1) within
> > - * the
> > - * SIGTERM signal handler, because the signal might arrive in the middle
> > - * of
> > - * some critical operation, like while we're holding a spinlock.
> > - * Instead, the
> > And I think we should keep the considerations it suggests. The patch
> > removes the comment itself, but it does so because it implements our
> > standard process exit procedure, which incorporates points suggested
> > by the now-removed comment.
>
> die() doesn't call exit(1). Unless DoingCommandRead is set, but it
> never is in the walreceiver. It looks just like the new
> WalRcvShutdownSignalHandler() function. Am I missing something?

Ugh.. Doesn't the name 'die()' suggest exit()?
I agree that die() can be used instad.

> Hmm, but doesn't bgworker_die() have that problem with exit(1)ing in
> the signal handler?

I noticed that but ignored for this time.

> I also wonder if we should replace SignalHandlerForShutdownRequest()
> completely with die(), in all processes? The difference is that
> SignalHandlerForShutdownRequest() uses ShutdownRequestPending, while
> die() uses ProcDiePending && InterruptPending to indicate that the
> signal was received. Or do some of the processes want to check for
> ShutdownRequestPending only at specific places, and don't want to get
> terminated at the any random CHECK_FOR_INTERRUPTS()?

At least, pg_log_backend_memory_context(<chkpt_pid>) causes a call to
ProcessInterrupts via "ereport(LOG_SERVER_ONLY" which can lead to an
exit due to ProcDiePending. In this regard, checkpointer clearly
requires the distinction.

Rather than merely consolidating the notification variables and
striving to annihilate CFI calls in the execution path, I
believe we need a shutdown mechanism that CFI doesn't react
to. However, as for the method to achieve this, whether we should keep
the notification variables separate as they are now, or whether it
would be better to introduce a variable that causes CFI to ignore
ProcDiePending, is a matter I think is open to discussion.

Attached patches are the rebased version of v3 (0003 is updated) and
additional 0005 that makes use of die() instead of walreceiver's
custom function.

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center

Attachment Content-Type Size
v4-0001-Revert-Revert-libpqwalreceiver-Convert-to-libpq-b.patch text/x-patch 3.4 KB
v4-0002-Apply-walrcv_shutdown_deblocking_v2-2.patch.patch text/x-patch 6.1 KB
v4-0003-Use-libpq-be-fe-helpers.h-wrappers-more.patch text/x-patch 9.0 KB
v4-0004-Use-existing-AmWalReceiverProcess-function.patch text/x-patch 1.3 KB
v4-0005-Use-die-instead-of-WalRcvShutdownSignalHandler.patch text/x-patch 1.9 KB

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Wei Wang (Fujitsu) 2024-01-29 07:35:35 RE: Fix inappropriate comments in function ReplicationSlotAcquire
Previous Message Masahiko Sawada 2024-01-29 07:29:10 Re: [PoC] Improve dead tuple storage for lazy vacuum