Re: the s_lock_stuck on perform_spin_delay

From: Andres Freund <andres(at)anarazel(dot)de>
To: Andy Fan <zhihuifan1213(at)163(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Matthias van de Meent <boekewurm+postgres(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Thomas Munro <tmunro(at)postgresql(dot)org>, pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: the s_lock_stuck on perform_spin_delay
Date: 2024-01-23 20:15:12
Message-ID: 20240123201512.5t2xlo6vtsuqj2fi@awork3.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

On 2024-01-22 15:18:35 +0800, Andy Fan wrote:
> I used sigismember(&BlockSig, SIGQUIT) to detect if a process is doing a
> quickdie, however this is bad not only because it doesn't work on
> Windows, but also it has too poor performance even it impacts on
> USE_ASSERT_CHECKING build only. In v8, I introduced a new global
> variable quickDieInProgress to handle this.

For reasons you already noted, using sigismember() isn't viable. But I am not
convinced by quickDieInProgress either. I think we could just reset the state
for the spinlock check in the code handling PANIC, perhaps via a helper
function in spin.c.

> +void
> +VerifyNoSpinLocksHeld(void)
> +{
> +#ifdef USE_ASSERT_CHECKING
> + if (last_spin_lock_file != NULL)
> + elog(PANIC, "A spin lock has been held at %s:%d",
> + last_spin_lock_file, last_spin_lock_lineno);
> +#endif
> +}

I think the #ifdef for this needs to be in the header, not here. Otherwise we
add a pointless external function call to a bunch of performance critical
code.

> From f09518df76572adca85cba5008ea0cae5074603a Mon Sep 17 00:00:00 2001
> From: "yizhi.fzh" <yizhi(dot)fzh(at)alibaba-inc(dot)com>
> Date: Fri, 19 Jan 2024 13:57:46 +0800
> Subject: [PATCH v8 2/3] Treat (un)LockBufHdr as a SpinLock.
>
> The LockBufHdr also used init_local_spin_delay / perform_spin_delay
> infrastructure and so it is also possible that PANIC the system
> when it can't be acquired in a short time, and its code is pretty
> similar with s_lock. so treat it same as SPIN lock when regarding to
> misuse of spinlock detection.
> ---
> src/backend/storage/buffer/bufmgr.c | 1 +
> src/include/storage/buf_internals.h | 1 +
> 2 files changed, 2 insertions(+)
>
> diff --git a/src/backend/storage/buffer/bufmgr.c b/src/backend/storage/buffer/bufmgr.c
> index 7d601bef6d..c600a113cf 100644
> --- a/src/backend/storage/buffer/bufmgr.c
> +++ b/src/backend/storage/buffer/bufmgr.c
> @@ -5409,6 +5409,7 @@ LockBufHdr(BufferDesc *desc)
>
> init_local_spin_delay(&delayStatus);
>
> + START_SPIN_LOCK();
> while (true)
> {
> /* set BM_LOCKED flag */

Seems pretty odd that we now need init_local_spin_delay() and
START_SPIN_LOCK(). Note that init_local_spin_delay() also wraps handling of
__FILE__, __LINE__ etc, so it seems we're duplicating state here.

Greetings,

Andres Freund

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Robert Haas 2024-01-23 20:20:31 Re: Parallelize correlated subqueries that execute within each worker
Previous Message Tristan Partin 2024-01-23 20:10:48 Re: Remove pthread_is_threaded_np() checks in postmaster