| From: | vignesh C <vignesh21(at)gmail(dot)com> |
|---|---|
| To: | shveta malik <shveta(dot)malik(at)gmail(dot)com> |
| Cc: | Peter Smith <smithpb2250(at)gmail(dot)com>, Shlok Kyal <shlok(dot)kyal(dot)oss(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
| Subject: | Re: Support EXCEPT for ALL SEQUENCES publications |
| Date: | 2026-04-13 10:51:18 |
| Message-ID: | CALDaNm3fyVKPssTcn=RKjMKYxy8aG_Zstn+e7ctBMVoy1wrxJQ@mail.gmail.com |
| Views: | Whole Thread | Raw Message | Download mbox | Resend email |
| Thread: | |
| Lists: | pgsql-hackers |
On Mon, 13 Apr 2026 at 15:46, shveta malik <shveta(dot)malik(at)gmail(dot)com> wrote:
>
> On Mon, Apr 13, 2026 at 9:58 AM Peter Smith <smithpb2250(at)gmail(dot)com> wrote:
> >
> > Hi Shlok.
> >
> > Before this patch goes too far, I had a fundamental question.
> >
> > I understand that sequences and tables are closely related; sequences
> > are just like a single-row table, but they have a RELKIND_SEQUENCE.
> >
> > There is a lot of shared code internally, so I guess it is tempting to
> > specify both the published sequences and tables together in the System
> > Catalog 'pg_publication_rels'.
> >
> > I'm just wondering whether that is really the best way to go?
> >
> > Currently, pg_publication_rels has only tables. So they might be:
> > * only included tables -- from a publication using "FOR TABLE ..."
> > * only excluded tables -- from a publication using "FOR ALL TABLES
> > EXCEPT (TABLE ...)"
> >
> > Because included/excluded tables cannot co-exist, we can easily know
> > the type of the CREATE/ALTER PUBLICATION command and the type of
> > 'pg_publication_rels' content without digging deeper.
> >
> > ~~~
> >
> > But introducing sequences introduces complexity. Now, AFAICT, we
> > cannot know what each row of 'pg_publication_rels' means without
> > inspecting the relation type of that row. e.g. now we have lots of
> > possible combinations like.
> >
> > pg_publication_rels has:
> > * only included tables.
> > * only excluded tables.
> > * only excluded sequences.
> > * excluded tables and excluded sequences.
> > * included tables and excluded sequences.
> >
> > Furthermore, there will be yet more combinations one day if the
> > individual "FOR SEQUENCE ..." syntax is implemented.
> >
> > pg_publication_rels has:
> > * only included sequences
> > * included sequences and included tables
> > * included sequences and excluded tables
> >
> > IIUC, it means that the SQL everywhere now requires joins and relkind
> > checks to identify the type.
> >
> > ~~
> >
> > Furthermore, AFAICT, the 'pg_publication_rels' attributes 'prattrs'
> > and 'prquals' don't even have meaning for sequences. That's another
> > reason why it feels a bit like a square peg was jammed into a round
> > hole just because 'pg_publication_rels' was conveniently available.
> >
> > ~~
> >
> > SUMMARY
> >
> > Given:
> > * Option 1 = use 'pg_publication_rels' for both tables and sequences.
> > * Option 2 = use 'pg_publication_rels' just for tables and use a new
> > 'pg_publication_seq' just for sequences.
> >
> > I'm not convinced that the chosen way (Option 1) is better. Can you
> > explain why it is?
> >
>
> pg_publication_seq seems like an idea worth considering. After giving
> it some thought, we already have a view named
> pg_publication_sequences, and adding another catalog with a similar
> name may introduce confusion. Apart from that, here are my thoughts:
>
> 1)
> pg_publication_rels already represents 'relations', and thus I feel
> sequences already fit naturally into this.
>
> 2)
> I agree that relkind checks may be required, but IMO these are
> typically cheap. In C code paths we already rely heavily on
> cache-lookups such as get_rel_relkind, and thus pg_class.relkind
> access is well-optimized. This is already used widely across the
> codebase, so IMO, extending it here does not introduce meaningful
> overhead.
>
> 3)
> Looking at the code paths accessing pg_publication_rels, they broadly
> fall into 4 categories:
>
> a) Publication DDL (CREATE/ALTER/DROP PUBLICATION)
> b) Tools and queries (e.g., pg_publication_tables view, pg_dump)
> c) Logical decoding (pgoutput paths like get_rel_sync_entry,
> pgoutput_column_list_init, pgoutput_row_filter_init)
> d) DML paths (INSERT/UPDATE) to check publication membership and RI
> applicability (CheckCmdReplicaIdentity)
>
> The first two are infrequent operations. For the logical decoding
> path, relation information is cached in RelationSyncCache, so relkind
> checks are not repeatedly performed in the hot path. For INSERT/UPDATE
> paths, PublicationDesc (rd_pubdesc) is built once per relation and is
> cached and reused, so pg_publication_rels is not accessed heavily here
> too.
>
> ~~
>
> Considering all the above, I feel we should be good with Option 1.
> But, I would like to see what others think on this.
+1 for option 1 and checking relkind for the same reason mentioned by
shveta above. It seems like a reasonable tradeoff compared to
introducing and maintaining a separate pg_publication_seq system
table.
Regards,
Vignesh
| From | Date | Subject | |
|---|---|---|---|
| Next Message | Dean Rasheed | 2026-04-13 11:03:53 | Re: Bug: Rule actions see wrong values for generated columns (NEW.gen reads OLD value) |
| Previous Message | Nishant Sharma | 2026-04-13 10:32:24 | Re: [BUG] CRASH: ECPGprepared_statement() and ECPGdeallocate_all() when connection is NULL |