Re: Race conditions in 019_replslot_limit.pl

From: Andres Freund <andres(at)anarazel(dot)de>
To: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
Cc: Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>, hlinnaka(at)iki(dot)fi, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Race conditions in 019_replslot_limit.pl
Date: 2022-02-16 17:26:25
Message-ID: 20220216172625.4q5ziexufyw45lyf@alap3.anarazel.de
Lists: pgsql-hackers

Hi,

On 2022-02-16 18:04:14 +0900, Masahiko Sawada wrote:
> On Wed, Feb 16, 2022 at 3:22 PM Kyotaro Horiguchi
> <horikyota(dot)ntt(at)gmail(dot)com> wrote:
> >
> > At Wed, 16 Feb 2022 14:58:23 +0900, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote in
> > > Or it's possible that the process took a time to clean up the
> > > temporary replication slot?
> >
> > Checkpointer may take ReplicationSlotControlLock. Dead lock between
> > ReplicationSlotCleanup and InvalidateObsoleteReplicationSlots
> > happened?

A deadlock requires some form of incorrect lock (or lock-like) nesting. Do
you have an idea what that could be?
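
To make "incorrect nesting" concrete, here is a minimal sketch in Python. It is
not PostgreSQL code: the path names and lock names are hypothetical, and it
only flags the simplest case, where one code path takes two locks in one order
and another path takes the same two in the opposite order.

```python
from itertools import combinations


def nesting_edges(acquisition_paths):
    """Collect (outer, inner) lock pairs implied by each path's acquisition order."""
    edges = set()
    for path in acquisition_paths.values():
        # Every earlier lock in the path is held while a later one is taken.
        for outer, inner in combinations(path, 2):
            edges.add((outer, inner))
    return edges


def inverted_pairs(acquisition_paths):
    """Return lock pairs that are nested in both orders by different paths.

    Only two-lock inversions are detected; longer cycles are out of scope.
    """
    edges = nesting_edges(acquisition_paths)
    return sorted((a, b) for (a, b) in edges if (b, a) in edges and a < b)


# Hypothetical acquisition orders, purely for illustration.
paths = {
    "backend_cleanup": ["lock_a", "lock_b"],
    "checkpointer": ["lock_b", "lock_a"],
}
print(inverted_pairs(paths))
```

If every path takes the locks in the same order, the result is an empty list,
and this particular kind of deadlock cannot occur.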

> That's possible. Whatever the exact cause of this failure, I think we
> can stabilize this test by adding a condition of application_name to
> the query.

I think the test is telling us that something may be broken. We shouldn't
silence that without at least some understanding of what it is.

It'd be good to try to reproduce this locally...

- Andres
