Re: logical replication worker accesses catalogs in error context callback

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>
Cc: Zhihong Yu <zyu(at)yugabyte(dot)com>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Peter Eisentraut <peter_e(at)gmx(dot)net>, Petr Jelinek <petr(at)2ndquadrant(dot)com>, Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>
Subject: Re: logical replication worker accesses catalogs in error context callback
Date: 2021-07-03 16:31:02
Message-ID: 1270252.1625329862@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com> writes:
> Isn't it better to have the below comment before
> slot_store_error_callback, similar to what's there before
> conversion_error_callback in v7-0002.
> * Note that this function mustn't do any catalog lookups, since we are in
> * an already-failed transaction.

Not really, as there's not much temptation to do so in the current form
of that function. I have no desire to go around and plaster every
errcontext callback with such comments.

> Maybe it's worth considering
> avoid_sys_cache_lookup_in_error_callback_v2.patch from [1] as another
> way to enforce the above statement.

That looks fundamentally broken from here. Wouldn't it forbid
perfectly OK code like this randomly-chosen example from tablecmds.c?

if (list_member_oid(inheritOids, parentOid))
ereport(ERROR,
(errcode(ERRCODE_DUPLICATE_TABLE),
errmsg("relation \"%s\" would be inherited from more than once",
get_rel_name(parentOid))));

That is, it's a bit hard to say exactly where in the error processing
sequence we should start deeming it unsafe to do a catalog lookup;
but the mere presence of an error stack entry can't mean that.

In a lot of situations, there wouldn't be any need to consider the
transaction broken until we've done a longjmp, so that catalog
lookups in errcontext callbacks would be perfectly safe. (Which
indeed is why we've not yet seen an actual problem in either of
the spots discussed in this thread.) The reason for being paranoid
about what an errcontext callback can do is that the callback cannot
know what the current failure's cause is. If we're trying to report
some internal error that means that the transaction really is pretty
broken, then it'd be problematic to initiate new catalog accesses.

regards, tom lane

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2021-07-03 16:33:03 Re: logical replication worker accesses catalogs in error context callback
Previous Message Gilles Darold 2021-07-03 15:46:10 Re: [PATCH] Hooks at XactCommand level