Re: Introduce wait_for_subscription_sync for TAP tests

From: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, "shiy(dot)fnst(at)fujitsu(dot)com" <shiy(dot)fnst(at)fujitsu(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Introduce wait_for_subscription_sync for TAP tests
Date: 2022-09-09 21:33:51
Message-ID: CAD21AoAZaQSTWTrT7gd8BpA8MARTYCnc44bRAV=aCXA2Pm4m8Q@mail.gmail.com
Lists: pgsql-hackers

On Fri, Sep 9, 2022 at 11:31 PM Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>
> Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> writes:
> > Pushed.
>
> Recently a number of buildfarm animals have failed at the same
> place in src/test/subscription/t/100_bugs.pl [1][2][3][4]:
>
> # Failed test '2x3000 rows in t'
> # at t/100_bugs.pl line 149.
> # got: '9000'
> # expected: '6000'
> # Looks like you failed 1 test of 7.
> [09:30:56] t/100_bugs.pl ......................
>
> This was the last commit to touch that test script. I'm thinking
> maybe it wasn't adjusted quite correctly? On the other hand, since
> I can't find any similar failures before the last 48 hours, maybe
> there is some other more-recent commit to blame. Anyway, something
> is wrong there.

This commit seems innocent, since it changed only how the test waits.
Rather, looking at the logs, the tablesync worker errored out at an
interesting point:

2022-09-09 09:30:19.630 EDT [631b3feb.840:13]
pg_16400_sync_16392_7141371862484106124 ERROR: could not find record
while sending logically-decoded data: missing contrecord at 0/1D4FFF8
2022-09-09 09:30:19.630 EDT [631b3feb.840:14]
pg_16400_sync_16392_7141371862484106124 STATEMENT: START_REPLICATION
SLOT "pg_16400_sync_16392_7141371862484106124" LOGICAL 0/0
(proto_version '3', origin 'any', publication_names '"testpub"')
ERROR: could not find record while sending logically-decoded data:
missing contrecord at 0/1D4FFF8
2022-09-09 09:30:19.631 EDT [631b3feb.26e8:2] ERROR: error while
shutting down streaming COPY: ERROR: could not find record while
sending logically-decoded data: missing contrecord at 0/1D4FFF8
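
As a rough illustration of why that position stands out, the sketch
below splits the LSN from the log into its page offset. It assumes the
default WAL block size of 8192 bytes (XLOG_BLCKSZ); the helper names
parse_lsn and page_position are made up for this example and are not
PostgreSQL functions.

```python
# Assumption: the server uses the default WAL block size (XLOG_BLCKSZ).
XLOG_BLCKSZ = 8192


def parse_lsn(text):
    """Convert an LSN printed as 'hi/lo' (both hex) into a 64-bit integer."""
    hi, lo = text.split("/")
    return (int(hi, 16) << 32) | int(lo, 16)


def page_position(lsn, block_size=XLOG_BLCKSZ):
    """Return (offset within the WAL page, bytes left before the page ends)."""
    offset = lsn % block_size
    return offset, block_size - offset


if __name__ == "__main__":
    lsn = parse_lsn("0/1D4FFF8")  # LSN taken from the error message above
    offset, remaining = page_position(lsn)
    print(f"offset in page: {offset}, bytes left on page: {remaining}")
```

Under that block-size assumption, only 8 bytes remain on the page at
0/1D4FFF8, so a record starting there has to continue on the next page
as a continuation record (contrecord).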

Commit f6c5edb8abcac04eb3eac6da356e59d399b2bcef is likely relevant.

Regards,

--
Masahiko Sawada
