Re: Race conditions in 019_replslot_limit.pl

From: Andres Freund <andres(at)anarazel(dot)de>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>, hlinnaka(at)iki(dot)fi, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Race conditions in 019_replslot_limit.pl
Date: 2022-02-17 04:32:52
Message-ID: 20220217043252.scmcjf7wfhxnb6so@alap3.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

On 2022-02-16 22:58:19 -0500, Tom Lane wrote:
> I wonder though if an exiting walsender would always take that path.

You're right - the CFI() in PostgresMain(), several replication commands, are
not at all guaranteed to be in a transaction when throwing a FATAL.

I don't quite see where we could be throwing an error with a relevant lwlock
held yet though.

> Maybe it'd be good if we released all LWLocks even if we don't think we're
> in a transaction?

Yea, probably a good idea. I think there's a few hooks could that could
conceivably be in trouble otherwise.

> More generally, it wasn't obvious to me why you thought that
> BaseInit was a good spot to place the on_shmem_exit call.

> It obviously can't be *before* that because of the pgstats
> dependency, but maybe it should be later?

Yea, I agree, it probably could be done later. It's not super obvious when
though. If e.g. somebody wrote a non-database connected bgworker to
synchronize slot state with another node (for failover, there's even patches
for it), it'd be just after BaseInit(). Additionally, it was previously in
ProcKill(), so moving it something registered early-ish seemed sensible.

Greetings,

Andres Freund

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Michael Paquier 2022-02-17 04:34:08 Re: adding 'zstd' as a compression algorithm
Previous Message kuroda.hayato@fujitsu.com 2022-02-17 04:32:26 RE: [Proposal] Add foreign-server health checks infrastructure