locking [user] catalog tables vs 2pc vs logical rep

From: Andres Freund <andres(at)anarazel(dot)de>
To: pgsql-hackers(at)postgresql(dot)org, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: Markus Wanner <markus(dot)wanner(at)enterprisedb(dot)com>
Subject: locking [user] catalog tables vs 2pc vs logical rep
Date: 2021-02-22 22:28:47
Message-ID: 20210222222847.tpnb6eg3yiykzpky@alap3.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

The 2pc decoding added in

commit a271a1b50e9bec07e2ef3a05e38e7285113e4ce6
Author: Amit Kapila <akapila(at)postgresql(dot)org>
Date: 2021-01-04 08:34:50 +0530

Allow decoding at prepare time in ReorderBuffer.

has a deadlock danger when used in a way that takes advantage of
separate decoding of the 2PC PREPARE.

I assume the goal of decoding the 2PC PREPARE is so one can wait for the
PREPARE to have logically replicated, before doing the COMMIT PREPARED.

However, currently it's pretty easy to get into a state where logical
decoding cannot progress until the 2PC transaction has
committed/aborted. Which essentially would lead to undetected deadlocks.

The problem is that catalog tables accessed during logical decoding need
to get locked (otherwise e.g. a table rewrite could happen
concurrently). But if the prepared transaction itself holds a lock on a
catalog table, logical decoding will block on that lock - which won't be
released until replication progresses. A deadlock.

A trivial example:

SELECT pg_create_logical_replication_slot('test', 'test_decoding');
CREATE TABLE foo(id serial primary key);
BEGIN;
LOCK pg_class;
INSERT INTO foo DEFAULT VALUES;
PREPARE TRANSACTION 'foo';

-- hangs waiting for pg_class to be unlocked
SELECT pg_logical_slot_get_changes('test', NULL, NULL, 'two-phase-commit', '1');

Now, more realistic versions of this scenario would probably lock a
'user catalog table' containing replication metadata instead of
pg_class, but ...

At first this seems to be a significant issue. But on the other hand, if
you were to shut the cluster down in this situation (or disconnect all
sessions), you have broken cluster on your hand - without logical
decoding being involved. As it turns out, we need to read pg_class to
log in... And I can't remember this being reported to be a problem?

Perhaps all that we need to do is to disallow 2PC prepare if [user]
catalog tables have been locked exclusively? Similar to how we're
disallowing preparing tables with temp table access.

Greetings,

Andres Freund

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Thomas Munro 2021-02-22 22:32:13 Re: Autovacuum worker doesn't immediately exit on postmaster death
Previous Message Daniel Gustafsson 2021-02-22 21:29:12 Re: libpq compression