Re: Failure in subscription test 004_sync.pl

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: Michael Paquier <michael(at)paquier(dot)xyz>, Postgres hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com>
Subject: Re: Failure in subscription test 004_sync.pl
Date: 2021-06-12 17:51:00
Message-ID: 2348394.1623520260@sss.pgh.pa.us
Lists: pgsql-hackers

Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> writes:
> On Sat, Jun 12, 2021 at 1:13 PM Michael Paquier <michael(at)paquier(dot)xyz> wrote:
>> wrasse has just failed with what looks like a timing error with a
>> replication slot drop:
>> https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=wrasse&dt=2021-06-12%2006%3A16%3A30

> If we want to fix this, we might want to wait till the slot is active
> on the publisher before trying to drop it but not sure if it is a good
> idea. In the worst case, if the user retries this operation (Drop
> Subscription), it will succeed.

wrasse's not the only animal reporting this type of failure.
See also

https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=komodoensis&dt=2021-06-12%2011%3A32%3A04

error running SQL: 'psql:<stdin>:1: ERROR: could not drop replication slot "pg_16387_sync_16384_6972886888894805957" on publisher: ERROR: replication slot "pg_16387_sync_16384_6972886888894805957" is active for PID 2971625'
while running 'psql -XAtq -d port=60321 host=/tmp/vdQmH7ijFI dbname='postgres' -f - -v ON_ERROR_STOP=1' with sql 'DROP SUBSCRIPTION testsub2' at /home/bf/build/buildfarm-komodoensis/HEAD/pgsql.build/../pgsql/src/test/perl/PostgresNode.pm line 1771.

https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=curculio&dt=2021-06-11%2020%3A30%3A28

error running SQL: 'psql:<stdin>:1: ERROR: could not drop replication slot "testsub2" on publisher: ERROR: replication slot "testsub2" is active for PID 27175'
while running 'psql -XAtq -d port=59579 host=/tmp/9Qchjsykek dbname='postgres' -f - -v ON_ERROR_STOP=1' with sql 'DROP SUBSCRIPTION testsub2' at /home/pgbf/buildroot/HEAD/pgsql.build/src/test/subscription/../../../src/test/perl/PostgresNode.pm line 1771.

Those are both in the t/100_bugs.pl test script, but otherwise they
look like the exact same thing.

I don't think that it's optional to fix a problem that occurs as
often as three times in 10 days in the buildfarm.
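
For illustration only, the "user retries the operation" fallback mentioned in the quoted text could be sketched as below. This is not the fix discussed in this thread, just a client-side workaround under stated assumptions: `drop_with_retry`, `SlotActiveError`, and the `drop` callable are hypothetical names, and `drop` stands in for whatever actually issues DROP SUBSCRIPTION and raises when the publisher reports the slot as still active.

```python
import time
from typing import Callable


class SlotActiveError(Exception):
    """Hypothetical stand-in for the publisher's
    'replication slot "..." is active for PID ...' error."""


def drop_with_retry(
    drop: Callable[[], None],
    attempts: int = 5,
    delay: float = 0.1,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Call drop() until it succeeds or the attempts run out.

    Returns the number of the attempt that succeeded. Only
    SlotActiveError is retried; any other exception propagates
    immediately, and the last SlotActiveError is re-raised once
    the attempts are exhausted.
    """
    for attempt in range(1, attempts + 1):
        try:
            drop()
            return attempt
        except SlotActiveError:
            if attempt == attempts:
                raise
            # Give the process still holding the slot time to exit.
            sleep(delay)
    raise ValueError("attempts must be at least 1")
```

A retry loop of this kind only hides the race from the caller; it does not remove the window in which the slot is still held on the publisher.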

regards, tom lane
