Re: locking [user] catalog tables vs 2pc vs logical rep

From: Michael Paquier <michael(at)paquier(dot)xyz>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: vignesh C <vignesh21(at)gmail(dot)com>, Petr Jelinek <petr(dot)jelinek(at)enterprisedb(dot)com>, Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com>, Ajin Cherian <itsajin(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>, Markus Wanner <markus(dot)wanner(at)enterprisedb(dot)com>
Subject: Re: locking [user] catalog tables vs 2pc vs logical rep
Date: 2021-05-25 07:10:15
Message-ID: YKyi17ER4pR61fWR@paquier.xyz
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, May 24, 2021 at 10:03:01AM +0530, Amit Kapila wrote:
> So, this appears to be an existing caveat of synchronous replication.
> If that is the case, I am not sure if it is a good idea to just block
> such ops for the prepared transaction. Also, what about other
> operations which acquire an exclusive lock on [user]_catalog_tables
> like:
> cluster pg_trigger using pg_class_oid_index, similarly cluster on any
> user_catalog_table, then the other problematic operation could
> truncate of user_catalog_table as is discussed in another thread [1].
> I think all such operations can block even with synchronous
> replication. I am not sure if we can create examples for all cases
> because for ex. we don't have use of user_catalog_tables in-core but
> maybe for others, we can try to create examples and see what happens?
>
> If all such operations can block for synchronous replication and
> prepared transactions replication then we might want to document them
> as caveats at page:
> https://www.postgresql.org/docs/devel/logicaldecoding-synchronous.html
> and then also give the reference for these caveats at prepared
> transactions page:https://www.postgresql.org/docs/devel/logicaldecoding-two-phase-commits.html
>
> What do you think?

It seems to me that the 2PC issues on catalog tables and the issues
related to logical replication in synchonous mode are two distinct
things that need to be fixed separately.

The issue with LOCK taken on a catalog while a PREPARE TRANSACTION
holds locks around is bad enough in itself as it could lock down a
user from a cluster as long as the PREPARE TRANSACTION is not removed
from WAL (say the relation is critical for the connection startup).
This could be really disruptive for the user even if he tried to take
a lock on an object he owns, and the way to recover is not easy here,
and the way to recover involves either an old backup or worse,
pg_resetwal.

The second issue with logical replication is still disruptive, but it
looks to me more like a don't-do-it issue, and documenting the caveats
sounds fine enough.

Looking at the patch from upthread..

+ /*
+ * Make note that we've locked a system table or an user catalog
+ * table. This flag will be checked later during prepare transaction
+ * to fail the prepare transaction.
+ */
+ if (lockstmt->mode >= ExclusiveLock &&
+ (IsCatalogRelationOid(reloid) ||
+ RelationIsUsedAsCatalogTable(rel)))
+ MyXactFlags |= XACT_FLAGS_ACQUIREDEXCLUSIVELOCK_SYSREL;
I think that I'd just use IsCatalogRelationOid() here, and I'd be more
severe and restrict all attempts for any lock levels. It seems to me
that this needs to happen within RangeVarCallbackForLockTable().
I would also rename the flag as just XACT_FLAGS_LOCKEDCATALOG.

+ errmsg("cannot PREPARE a transaction that has an exclusive lock on user catalog/system table(s)")));
What about "cannot PREPARE a transaction that has locked a catalog
relation"?
--
Michael

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Michael Paquier 2021-05-25 07:21:45 Re: Assertion failure while streaming toasted data
Previous Message Masahiko Sawada 2021-05-25 06:55:36 Re: Skipping logical replication transactions on subscriber side