Re: Failure of subscription tests with topminnow

From: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: Ajin Cherian <itsajin(at)gmail(dot)com>, Michael Paquier <michael(at)paquier(dot)xyz>, Postgres hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Failure of subscription tests with topminnow
Date: 2021-08-25 12:40:12
Message-ID: CAD21AoD+b8JFhbG8Wn7KHq6UEtNLG_+nk3gOk-Gwh=1dJjUFng@mail.gmail.com
Lists: pgsql-hackers

On Wed, Aug 25, 2021 at 9:23 PM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
>
> On Wed, Aug 25, 2021 at 5:02 PM Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
> >
> > On Wed, Aug 25, 2021 at 6:53 PM Ajin Cherian <itsajin(at)gmail(dot)com> wrote:
> > >
> > > On Wed, Aug 25, 2021 at 5:43 PM Ajin Cherian <itsajin(at)gmail(dot)com> wrote:
> > > >
> > > > On Wed, Aug 25, 2021 at 4:22 PM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> > > > >
> > > > > On Wed, Aug 25, 2021 at 8:00 AM Ajin Cherian <itsajin(at)gmail(dot)com> wrote:
> > > > > >
> > > > > > On Tue, Aug 24, 2021 at 11:12 PM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> > > > > >
> > > > > > > But will poll function still poll or exit? Have you tried that?
> > > > > >
> > > > > > I have forced that condition with a changed query and found that the
> > > > > > poll will not exit in case of a NULL return.
> > > > > >
> > > > >
> > > > > What if the query in a poll is fired just before we get an error
> > > > > "tap_sub ERROR: replication slot "tap_sub" is active for PID 16336"?
> > > > > Won't both the old and new walsenders be present at that stage, so
> > > > > the query might return true? You can check that via debugger by
> > > > > stopping just before this error occurs and then checking the
> > > > > pg_stat_replication view.
> > > >
> > > > If this error happens then the PID is NOT updated as the pid in the
> > > > replication slot. I have checked this and explained it in my first
> > > > email itself.
> > > >
> > >
> > > Sorry about the above email, I misunderstood. I was looking at
> > > pg_stat_replication_slot rather than pg_stat_replication, hence the
> > > confusion. Amit is correct: just prior to erroring out, the walsender
> > > briefly appears in pg_stat_replication, and that is why this error
> > > happens. Sorry for the confusion.
> > > I just confirmed it, got both the walsenders stopped in the debugger:
> > >
> > > postgres=# select pid from pg_stat_replication where application_name = 'sub';
> > > pid
> > > ------
> > > 7899
> > > 7993
> > > (2 rows)
> >
> > IIUC the query[1] used for polling returns two rows in this case: {t,
> > f} or {f, t}. But did poll_query_until() return OK in this case even
> > though we expected one row of 't'? My guess of how this issue happened is:
> >
>
> Yeah, we can check this but I guess as soon as it gets 't', the poll
> query will exit.

I did a quick check with the following TAP test code:

$node_publisher->poll_query_until('postgres',
    qq(
    select 1 != foo.column1 from (values(0), (1)) as foo;
    ));

The query returns {t, f} but poll_query_until() never finished. The
same is true when the query returns {f, t}.
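
This looks consistent with how the comparison in poll_query_until() appears
to work, assuming it strips the trailing newline from the "psql -XAt" output
and compares the whole remaining string against the expected value (default
't'). The snippet below is only a rough sketch of that assumed comparison;
poll_would_succeed() is a hypothetical helper for illustration, not part of
PostgresNode, and the strings stand in for psql output.

```perl
use strict;
use warnings;

# Hypothetical helper: mimics the assumed comparison in poll_query_until().
# $raw stands in for what "psql -XAt" would print for the polling query.
sub poll_would_succeed
{
	my ($raw, $expected) = @_;
	$expected = 't' unless defined($expected);    # assumed default
	chomp($raw);    # removes only the final newline, not the one between rows
	return $raw eq $expected ? 1 : 0;
}

# One row of 't' matches; two rows do not, in either order, because the
# whole output ("t\nf" or "f\nt") is compared as a single string.
printf "%-8s => %d\n", 'single t', poll_would_succeed("t\n");
printf "%-8s => %d\n", '{t, f}',   poll_would_succeed("t\nf\n");
printf "%-8s => %d\n", '{f, t}',   poll_would_succeed("f\nt\n");
printf "%-8s => %d\n", 'NULL',     poll_would_succeed("\n");
```

Under that assumption, a query that returns two rows cannot satisfy the
poll, and neither can a NULL result, which prints as an empty line.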

Regards,

--
Masahiko Sawada
EDB: https://www.enterprisedb.com/
