Re: BUG #18545: \dt breaks transaction, calling error when executed in SET SESSION AUTHORIZATION

From: Noah Misch <noah(at)leadboat(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Justin Pryzby <pryzby(at)telsasoft(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Андрей Рачицкий <therealgofman(at)mail(dot)ru>, pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: BUG #18545: \dt breaks transaction, calling error when executed in SET SESSION AUTHORIZATION
Date: 2024-10-19 23:28:15
Message-ID: 20241019232815.c6.nmisch@google.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs pgsql-hackers

On Thu, Sep 19, 2024 at 05:35:33PM -0400, Tom Lane wrote:
> So the fix seems clear to me: RestoreRelationMap needs to happen
> before anything that could result in catalog lookups. I'm kind
> of inclined to move up the adjacent restores of non-transactional
> low-level stuff too, particularly RestoreReindexState which has
> direct impact on how catalog lookups are done.

Thanks for debugging that. RestorePendingSyncs() also changes what
RelationInitPhysicalAddr() puts in the relcache entry, so it needs to stay
with RestoreRelationMap(). I'm attaching the fix I have in mind. Since the
above (commit 126ec0b), the following has been getting an
AssertPendingSyncConsistency() trap:

make check-tests TESTS=reindex_catalog PG_TEST_INITDB_EXTRA_OPTS='-cwal_level=minimal -cmax_wal_senders=0'

In commit range [6e086fa,126ec0b^], that failed differently, reflecting the
outdated relmapper. (6e086fa is mostly innocent here, though.)

I looked for ways we'd regret not back-patching commit 126ec0b and this, but I
didn't find a concrete problem. After InvalidateSystemCaches(), the
back-branch parallel worker relcache contains the nailed relations. Each
entry then has:

- rd_locator possibly outdated
- rd_firstRelfilelocatorSubid==0 (correct for rd_locator)
- rd_isvalid==false, from RelationReloadNailed()

From code comments, I gather RelationCacheInitializePhase2() is the latest we
use an rd_locator without first making the relcache entry rd_isvalid=true. If
that's right, so long as no catalogs get rd_isvalid=true between
InvalidateSystemCaches() and RestorePendingSyncs(), we're okay without a
back-patch.

I was going to add check-world coverage of this case, but I stopped in light
of the tricky deadlocks that led to commit fe4d022 disabling the
reindex_catalog test. check-world does reach it, in inplace.spec, if one runs
with both wal_level=minimal and debug_parallel_query=regress. (inplace.spec
may eventually fail with the same deadlocks. I've not heard of it deadlocking
so far, and being a NO_INSTALLCHECK test helps.)

Attachment Content-Type Size
RestorePendingSyncs-at-RestoreRelationMap-v1.patch text/plain 2.9 KB

In response to

Responses

Browse pgsql-bugs by date

  From Date Subject
Next Message Tom Lane 2024-10-19 23:48:35 Re: BUG #18545: \dt breaks transaction, calling error when executed in SET SESSION AUTHORIZATION
Previous Message Tom Lane 2024-10-19 16:05:57 Re: BUG #18657: Using JSON_OBJECTAGG with volatile function leads to segfault

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2024-10-19 23:48:35 Re: BUG #18545: \dt breaks transaction, calling error when executed in SET SESSION AUTHORIZATION
Previous Message David G. Johnston 2024-10-19 20:11:22 Improve documentation regarding custom settings, placeholders, and the administrative functions