Re: Failure in subscription test 004_sync.pl

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>
Cc: Michael Paquier <michael(at)paquier(dot)xyz>, Postgres hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com>
Subject: Re: Failure in subscription test 004_sync.pl
Date: 2021-06-14 08:48:34
Message-ID: CAA4eK1L8KHCxtvMQP64uRfW9ZCKKEVKUOV=4x9hT=7-CpFFD0g@mail.gmail.com
Lists: pgsql-hackers

On Mon, Jun 14, 2021 at 10:41 AM Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
>
> >
> > I think it is showing a race condition issue in the code. In
> > DropSubscription, we first stop the worker that is receiving the WAL,
> > and then in a separate connection with the publisher, it tries to drop
> > the slot which leads to this error. The reason is that walsender is
> > still active as we just wait for wal receiver (or apply worker) to
> > stop. Normally, as soon as the apply worker is stopped the walsender
> > detects it and exits but in this case, it took some time to exit, and
> > in the meantime, we tried to drop the slot which is still in use by
> > walsender.
>
> That might be possible.
>
> That's weird since DROP SUBSCRIPTION executes DROP_REPLICATION_SLOT
> command with WAIT option. I found a bug that is possibly an oversight
> of commit 1632ea4368.
>
..
>
> The condition should be the opposite; we should raise the error when
> 'nowait' is true. I think this is the cause of the test failure. Even
> if DROP SUBSCRIPTION tries to drop the slot with the WAIT option, we
> don't wait but raise the error.
>
> The attached small patch fixes it.
>

Yes, this should fix the recent buildfarm failures. Alvaro, would you
like to take care of this?
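
To make the inverted check concrete, here is a rough, self-contained sketch of the behaviour described above. It is a simplified model, not the PostgreSQL source: Slot, SlotResult, slot_is_busy, acquire_buggy and acquire_fixed are hypothetical names made up for illustration, and the busy-wait loop only stands in for whatever sleeping the real code does while the walsender still holds the slot.

```c
#include <stdbool.h>

/* Hypothetical outcome codes; these are not PostgreSQL identifiers. */
typedef enum { SLOT_ACQUIRED, SLOT_ERROR_IN_USE } SlotResult;

/*
 * Simplified stand-in for a replication slot. polls_until_free models how
 * many more checks will find the slot busy before the old walsender exits.
 */
typedef struct
{
    int polls_until_free;
} Slot;

/* Returns true while the (simulated) walsender still holds the slot. */
static bool
slot_is_busy(Slot *slot)
{
    if (slot->polls_until_free > 0)
    {
        slot->polls_until_free--;
        return true;
    }
    return false;
}

/* Inverted check: a caller that asked to wait (nowait == false) gets the error. */
SlotResult
acquire_buggy(Slot *slot, bool nowait)
{
    while (slot_is_busy(slot))
    {
        if (!nowait)
            return SLOT_ERROR_IN_USE;
        /* a nowait caller ends up looping here instead */
    }
    return SLOT_ACQUIRED;
}

/* Intended check: only a nowait caller gets the error; others keep retrying. */
SlotResult
acquire_fixed(Slot *slot, bool nowait)
{
    while (slot_is_busy(slot))
    {
        if (nowait)
            return SLOT_ERROR_IN_USE;
        /* assumption: real code would sleep here until signaled, then retry */
    }
    return SLOT_ACQUIRED;
}
```

With a slot that stays busy for a couple of checks, acquire_buggy(&slot, false) returns SLOT_ERROR_IN_USE right away, which matches the failure seen when DROP SUBSCRIPTION asks to wait, while acquire_fixed(&slot, false) keeps retrying until the slot is free.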

--
With Regards,
Amit Kapila.
