More race conditions in logical replication

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: pgsql-hackers(at)postgreSQL(dot)org
Subject: More race conditions in logical replication
Date: 2017-07-02 23:54:48
Message-ID: 11904.1499039688@sss.pgh.pa.us
Lists: pgsql-hackers

I noticed a recent failure that looked suspiciously like a race condition:

https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=hornet&dt=2017-07-02%2018%3A02%3A07

The critical bit in the log file is

error running SQL: 'psql:<stdin>:1: ERROR: could not drop the replication slot "tap_sub" on publisher
DETAIL: The error was: ERROR: replication slot "tap_sub" is active for PID 3866790'
while running 'psql -XAtq -d port=59543 host=/tmp/QpCJtafT7R dbname='postgres' -f - -v ON_ERROR_STOP=1' with sql 'DROP SUBSCRIPTION tap_sub' at /home/nm/farm/xlc64/HEAD/pgsql.build/src/test/subscription/../../../src/test/perl/PostgresNode.pm line 1198.

After poking at it a bit, I found that I can cause several different
failures of this ilk in the subscription tests by injecting delays at
the points where a slot's active_pid is about to be cleared, as in the
attached patch (which also adds some extra printouts for debugging
purposes; none of that is meant for commit). It seems clear that there
is inadequate interlocking going on when we kill and restart a logical
rep worker: we're trying to start a new one before the old one has
gotten out of the slot.
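
To make the failure mode concrete, here is a toy model of the race, not PostgreSQL code. The `Slot` class, its methods, and the `simulate` helper are all hypothetical names invented for illustration; only the slot name, the PID, and the error text are taken from the log above. The `delay` argument stands in for the injected delay just before active_pid is cleared, and the `wait=True` path is one possible form of interlocking, assumed here for contrast rather than proposed in this message.

```python
import threading
import time


class Slot:
    """Toy stand-in for a replication slot (hypothetical, not PostgreSQL's struct)."""

    def __init__(self, name):
        self.name = name
        self.active_pid = 0  # 0 means no process currently holds the slot
        self.cond = threading.Condition()

    def _busy_error(self):
        return RuntimeError(
            f'replication slot "{self.name}" is active for PID {self.active_pid}'
        )

    def acquire(self, pid):
        with self.cond:
            if self.active_pid != 0:
                raise self._busy_error()
            self.active_pid = pid

    def release(self, delay=0.0):
        # Injected delay at the point where active_pid is about to be cleared.
        time.sleep(delay)
        with self.cond:
            self.active_pid = 0
            self.cond.notify_all()

    def drop(self, wait=False, timeout=5.0):
        with self.cond:
            if wait:
                # Assumed interlock: block until the old holder is really gone.
                self.cond.wait_for(lambda: self.active_pid == 0, timeout=timeout)
            if self.active_pid != 0:
                raise self._busy_error()
            return True


def simulate(wait):
    """Old worker is told to exit, then the slot is dropped right away."""
    slot = Slot("tap_sub")
    slot.acquire(pid=3866790)
    # The exiting worker releases the slot only after a short delay.
    worker = threading.Thread(target=slot.release, kwargs={"delay": 0.2})
    worker.start()
    try:
        slot.drop(wait=wait)
        outcome = "dropped"
    except RuntimeError as exc:
        outcome = str(exc)
    worker.join()
    return outcome


if __name__ == "__main__":
    print("no interlock:  ", simulate(wait=False))
    print("with interlock:", simulate(wait=True))
```

Without the wait, the drop checks active_pid while the old worker is still inside its delay and fails with the same kind of "is active for PID" error; with the wait, the drop proceeds only once the old worker has left the slot.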

I'm not particularly interested in fixing this myself, so I'm just
going to add it to the open items list.

regards, tom lane

Attachment Content-Type Size
break_repl_slot_management.patch text/x-diff 2.2 KB
