Re: Adding REPACK [concurrently]

From: Andres Freund <andres(at)anarazel(dot)de>
To: Antonin Houska <ah(at)cybertec(dot)at>
Cc: Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Mihail Nikalayeu <mihailnikalayeu(at)gmail(dot)com>, Srinath Reddy Sadipiralla <srinath2133(at)gmail(dot)com>, Matthias van de Meent <boekewurm+postgres(at)gmail(dot)com>, Pg Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Robert Treat <rob(at)xzilla(dot)net>
Subject: Re: Adding REPACK [concurrently]
Date: 2026-04-07 15:48:42
Message-ID: pveffyxhnuurhb44uzqlwo3rkyzorkfh2rot7uwzlf2axhfvbp@7nrs2omysxkc
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

On 2026-04-07 17:38:24 +0200, Antonin Houska wrote:
> Andres Freund <andres(at)anarazel(dot)de> wrote:
>
> > On 2026-04-07 14:33:50 +0200, Alvaro Herrera wrote:
> > > On 2026-Apr-07, Amit Kapila wrote:
> > >
> > > > I have a question based on 0001's commit message: "This patch adds a
> > > > new option to logical replication output plugin, to declare that it
> > > > does not use shared catalogs (i.e. catalogs that can be changed by
> > > > transactions running in other databases in the cluster).". In which
> > > > cases, currently plugin needs to access multi-database transactions or
> > > > transactions that need to access shared catalogs and on what basis a
> > > > plugin can decide that the changes it requires won't need any such
> > > > access.
> > >
> > > I don't think any plugin needs "multi-database" access as such, but
> > > needing access to shared catalogs is likely normal. Repack knows it
> > > won't access any shared catalogs, so it can set the flag at ease.
> > >
> > > There's a cross-check added in the commit that tests for access to
> > > shared catalogs if the flag is set to false. I guess you could set it
> > > to false and see what breaks :-)
> >
> > I think this has a quite high chance of indirect breakages. You just need some
> > cache invalidation processing / building accessing shared catalogs to violate
> > the rule, and whether that happens very heavily depends on what cache entries
> > are present and whether something registers relcache callbacks or such.
> >
> > This can be triggered by an output function during logical decoding or such,
> > so you don't really have control over it.
>
> The REPACK plugin only deforms tuples and writes them to a file, so I think
> that things like this should not happen.

You don't need to do it yourself. It just requires a shared_preload_library
extension to register a relcache invalidation callback that accesses shared
catalog.

It's only kind of an accident that we don't have a case today that accesses
shared catcaches during a relcache build in core (I'm not even sure there's
nothing). You'd just need somebody to add e.g. relcache caching for
publications for that to change. Or look up information about a reloption in
pg_parameter_acl. Or lookup tablespace configuration.

> However, I admit that an option that allows the plugin developer to declare
> "I don't need shared catalogs" may be considered deceptive.

At the very least it would need to be a runtime check rather than just an
assert. This would much more likely to be hit in production because otherwise
it's probably hard to hit the case where shared invalidations happen in the
wrong moment. And the consequences are corrupted caches, which could cause
all kinds of havoc.

But I think this may need more infrastructure / deeper analysis than what we
can do right now.

Greetings,

Andres Freund

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Heikki Linnakangas 2026-04-07 15:55:23 Re: pg_buffercache: Add per-relation summary stats
Previous Message Antonin Houska 2026-04-07 15:38:24 Re: Adding REPACK [concurrently]