Re: Support EXCEPT for ALL SEQUENCES publications

From: vignesh C <vignesh21(at)gmail(dot)com>
To: shveta malik <shveta(dot)malik(at)gmail(dot)com>
Cc: Peter Smith <smithpb2250(at)gmail(dot)com>, Shlok Kyal <shlok(dot)kyal(dot)oss(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Support EXCEPT for ALL SEQUENCES publications
Date: 2026-04-13 10:51:18
Message-ID: CALDaNm3fyVKPssTcn=RKjMKYxy8aG_Zstn+e7ctBMVoy1wrxJQ@mail.gmail.com
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, 13 Apr 2026 at 15:46, shveta malik <shveta(dot)malik(at)gmail(dot)com> wrote:
>
> On Mon, Apr 13, 2026 at 9:58 AM Peter Smith <smithpb2250(at)gmail(dot)com> wrote:
> >
> > Hi Shlok.
> >
> > Before this patch goes too far, I had a fundamental question.
> >
> > I understand that sequences and tables are closely related; sequences
> > are just like a single-row table, but they have a RELKIND_SEQUENCE.
> >
> > There is a lot of shared code internally, so I guess it is tempting to
> > specify both the published sequences and tables together in the System
> > Catalog 'pg_publication_rels'.
> >
> > I'm just wondering whether that is really the best way to go?
> >
> > Currently, pg_publication_rels has only tables. So they might be:
> > * only included tables -- from a publication using "FOR TABLE ..."
> > * only excluded tables -- from a publication using "FOR ALL TABLES
> > EXCEPT (TABLE ...)"
> >
> > Because included/excluded tables cannot co-exist, we can easily know
> > the type of the CREATE/ALTER PUBLICATION command and the type of
> > 'pg_publication_rels' content without digging deeper.
> >
> > ~~~
> >
> > But introducing sequences introduces complexity. Now, AFAICT, we
> > cannot know what each row of 'pg_publication_rels' means without
> > inspecting the relation type of that row. e.g. now we have lots of
> > possible combinations like.
> >
> > pg_publication_rels has:
> > * only included tables.
> > * only excluded tables.
> > * only excluded sequences.
> > * excluded tables and excluded sequences.
> > * included tables and excluded sequences.
> >
> > Furthermore, there will be yet more combinations one day if the
> > individual "FOR SEQUENCE ..." syntax is implemented.
> >
> > pg_publication_rels has:
> > * only included sequences
> > * included sequences and included tables
> > * included sequences and excluded tables
> >
> > IIUC, it means that the SQL everywhere now requires joins and relkind
> > checks to identify the type.
> >
> > ~~
> >
> > Furthermore, AFAICT, the 'pg_publication_rels' attributes 'prattrs'
> > and 'prquals' don't even have meaning for sequences. That's another
> > reason why it feels a bit like a square peg was jammed into a round
> > hole just because 'pg_publication_rels' was conveniently available.
> >
> > ~~
> >
> > SUMMARY
> >
> > Given:
> > * Option 1 = use 'pg_publication_rels' for both tables and sequences.
> > * Option 2 = use 'pg_publication_rels' just for tables and use a new
> > 'pg_publication_seq' just for sequences.
> >
> > I'm not convinced that the chosen way (Option 1) is better. Can you
> > explain why it is?
> >
>
> pg_publication_seq seems like an idea worth considering. After giving
> it some thought, we already have a view named
> pg_publication_sequences, and adding another catalog with a similar
> name may introduce confusion. Apart from that, here are my thoughts:
>
> 1)
> pg_publication_rels already represents 'relations', and thus I feel
> sequences already fit naturally into this.
>
> 2)
> I agree that relkind checks may be required, but IMO these are
> typically cheap. In C code paths we already rely heavily on
> cache-lookups such as get_rel_relkind, and thus pg_class.relkind
> access is well-optimized. This is already used widely across the
> codebase, so IMO, extending it here does not introduce meaningful
> overhead.
>
> 3)
> Looking at the code paths accessing pg_publication_rels, they broadly
> fall into 4 categories:
>
> a) Publication DDL (CREATE/ALTER/DROP PUBLICATION)
> b) Tools and queries (e.g., pg_publication_tables view, pg_dump)
> c) Logical decoding (pgoutput paths like get_rel_sync_entry,
> pgoutput_column_list_init, pgoutput_row_filter_init)
> d) DML paths (INSERT/UPDATE) to check publication membership and RI
> applicability (CheckCmdReplicaIdentity)
>
> The first two are infrequent operations. For the logical decoding
> path, relation information is cached in RelationSyncCache, so relkind
> checks are not repeatedly performed in the hot path. For INSERT/UPDATE
> paths, PublicationDesc (rd_pubdesc) is built once per relation and is
> cached and reused, so pg_publication_rels is not accessed heavily here
> too.
>
> ~~
>
> Considering all the above, I feel we should be good with Option 1.
> But, I would like to see what others think on this.

+1 for option 1 and checking relkind for the same reason mentioned by
shveta above. It seems like a reasonable tradeoff compared to
introducing and maintaining a separate pg_publication_seq system
table.

Regards,
Vignesh

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Dean Rasheed 2026-04-13 11:03:53 Re: Bug: Rule actions see wrong values for generated columns (NEW.gen reads OLD value)
Previous Message Nishant Sharma 2026-04-13 10:32:24 Re: [BUG] CRASH: ECPGprepared_statement() and ECPGdeallocate_all() when connection is NULL