Quick Links

Re: Fix race condition in pg_get_publication_tables with concurrent DROP TABLE

From:	Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>
To:	Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
Cc:	shveta malik <shveta(dot)malik(at)gmail(dot)com>, Ajin Cherian <itsajin(at)gmail(dot)com>, Bertrand Drouvot <bertranddrouvot(dot)pg(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject:	Re: Fix race condition in pg_get_publication_tables with concurrent DROP TABLE
Date:	2026-06-23 22:40:12
Message-ID:	CALj2ACU6JY3cfUYZ3G1ZUwKUx-X64hbzQaLAQvuKk54oyP_=Pw@mail.gmail.com
Views:	Whole Thread \| Raw Message \| Download mbox \| Resend email
Thread:
Lists:	pgsql-hackers

Hi,

On Tue, Jun 16, 2026 at 6:20 PM Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
>
> > However, I would like to optimize this part of the code by 1/ gating
> > it with InvalidOid checkup, 2/ moving the schemaid fetch closer to
> > where it's being used i.e. inside if (!pub->alltables). I would like
> > to do this only for HEAD, so, I attached it as a separate 0002 patch.
> > For pre-HEAD versions, I removed this check and retained the
> > try_table_open() fix.
>
> I'm not sure skipping get_rel_namespace() contributes to a visible
> performance gain. Rather I'm concerned that checking
> SearchSysCacheExists2(PUBLICATIONNAMESPACEMAP) with
> schemaid=InvalidOid could lead to unexpected results.

Thanks for reviewing!

AFAICS it won't give unexpected results. The first time it looks up a
publication with an invalid schema OID, it would cause a cache miss
and leave a negative cache entry - but that's per publication, not per
relation, so negligible.

That said, with the approach of using try_table_open for all tables
regardless and getting the schema from the open relation via
RelationGetNamespace, schemaid is always valid by the time it reaches
the syscache lookup (might be worth adding an assertion here). So the
get_rel_namespace optimization (0002) can be dropped entirely.

> I'm concerned about the fact that the patch handles only the case
> where the table doesn't have the column list. The tables in the result
> not having the column list are guaranteed to be present since we do
> the existence check for them but others are not. I guess it's reliable
> if we could call try_table_open() for all tables in the table_infos,
> and check its namespace using RelationGetNamespace(). I think it
> doesn't lead to noticeable overheads in practice as we're already
> calling table_open() for all tables in ALL TABLES publications or ALL
> TABLE IN SCHEMA publications where users cannot specify the column
> list. What do you think?

My earlier argument was that no function returning table OIDs can
guarantee they remain valid - a drop can happen right after we return
the OID, and accuracy is in the caller's hands. All the callers of
pg_get_publication_tables already handle this by JOINing with
pg_class.

However, a closer look at other functions that either build a list of
table OIDs (expand_partitioned_rtentry) or work on previously built
table OIDs (vacuum_open_relation) proves me wrong - they all account
for concurrent table drops with try_table_open. So, I'm convinced to
add try_table_open in pg_get_publication_tables for all the tables
regardless, unless I'm missing something here.

> > > If we use tuplestore instead of SRF, can we simplify the code as we
> > > would not need publication_tables_state and the above check? It would
> > > be only for the master, though.
> >
> > I implemented this idea for HEAD and it simplifies the code a bit.
>
> Sorry, I should have mentioned that the changes to use tuplestore
> instead of SRF should be PG20. I think that we should not such a
> refactoring even PG19.

I will drop the tuplestore changes for now and repost them as a
refactoring patch after the PG20 dev branch is cut.

Thoughts?

--
Bharath Rupireddy
Amazon Web Services: https://aws.amazon.com

In response to

Re: Fix race condition in pg_get_publication_tables with concurrent DROP TABLE at 2026-06-17 01:20:17 from Masahiko Sawada

Browse pgsql-hackers by date

	From	Date	Subject
Next Message	Alexander Nestorov	2026-06-23 22:53:47	Re: [PATCH] btree_gist: add cross-type integer operator support for GiST
Previous Message	Ben Mejia	2026-06-23 22:36:06	Re: hashjoins vs. Bloom filters (yet again)