| From: | Roberto Mello <roberto(dot)mello(at)gmail(dot)com> |
|---|---|
| To: | "David G(dot) Johnston" <david(dot)g(dot)johnston(at)gmail(dot)com> |
| Cc: | Álvaro Herrera <alvherre(at)kurilemu(dot)de>, PostgreSQL Developers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
| Subject: | Re: pg_publication_tables: return NULL attnames when no column list is specified |
| Date: | 2026-03-31 23:23:11 |
| Message-ID: | CAKz==b+MwMXSqfXyq90PW4qmNDTqrVAoTLAG3BsK-tpq54WkOg@mail.gmail.com |
| Lists: | pgsql-hackers |
On Tue, Mar 31, 2026 at 4:55 PM David G. Johnston <david(dot)g(dot)johnston(at)gmail(dot)com> wrote:
>
> IIUC the wording for v18 and earlier should read more like:
>
> “Subscriptions having several publications in which the same table has
> different sets of columns published are not supported.”
>
> The claim that this de facto behavior is a bug needing to be fixed is now
> before us (there is no disagreement that the physical column lists are
> different - null vs non-null). My cursory take at this leads me to believe
> we should accept what actually got implemented and not call this a bug to
> be fixed (aside from the docs).
>
> That the catalog is the only official source of truth regarding the
> physical column list distinction, and the function represents the logical
> “set of columns actually seen”, makes sense seen in that light.
>
The internal code was designed around the NULL/non-NULL distinction. The
SRF pg_get_publication_tables() is the one place that erased it, and the
CASE WHEN relnatts heuristic in tablesync was an attempt to reverse that
erasure, but it's demonstrably broken for tables with dropped columns.
That seems like a bug to me regardless of how we feel about the
behavioral question, but I have no objections to not calling it a bug.
I'm confident the best thing was intended when the code was committed
and hindsight is always 20/20.
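
To make the dropped-column failure concrete, here is a toy Python model
of the heuristic, not PostgreSQL code. It assumes that relnatts still
counts dropped columns (their pg_attribute entries remain) while the SRF
reports only live columns; Attr, srf_attrs and heuristic are
hypothetical names used only for this sketch.

```python
from dataclasses import dataclass


@dataclass
class Attr:
    """One pg_attribute-like entry (hypothetical, for illustration only)."""
    name: str
    dropped: bool = False


def srf_attrs(table, column_list):
    """Model of the SRF output: a missing (None) column list is expanded
    to every live column, so the NULL/non-NULL distinction is erased."""
    live = [a.name for a in table if not a.dropped]
    return live if column_list is None else list(column_list)


def heuristic(attrs, relnatts):
    """Model of the CASE WHEN check: assume 'no column list' whenever the
    number of reported columns equals relnatts."""
    return None if len(attrs) == relnatts else attrs


# Assumption: relnatts counts dropped columns too, so it is len(table) here.
plain = [Attr("id"), Attr("name")]
with_drop = [Attr("id"), Attr("name"), Attr("old", dropped=True)]

r_plain = heuristic(srf_attrs(plain, None), len(plain))
r_drop = heuristic(srf_attrs(with_drop, None), len(with_drop))

print(r_plain)  # None: correctly recognised as "no column list"
print(r_drop)   # ['id', 'name']: mistaken for an explicit column list
```

In this model the publication has no column list in both cases, yet the
table with a dropped column is reported as if it had one.
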

> I haven’t dived deep enough to understand whether there is a C code issue
> that needs to be resolved. Or whether we can make dealing with this more
> user-friendly given this constraint.
>
> Removing the limitation would seem more appealing if we are going to make
> a change. The obvious answer of “union all sets of columns published for a
> table and replicate those” would be the simplest to document though I
> suspect the current implementation basically chooses one of the
> publications to pull from which makes that difficult in the general case.
> I do kinda wonder why we need to enforce any kind of error so long as one
> of the publications for a given table includes all columns though. Or even
> is a proper superset to be a tiny bit more flexible. A technically
> uninformed wondering but still.
>
The superset idea would be a significant change to how the WAL output
plugin works. pgoutput.c doesn't have a concept of "this publication
contributes columns X and that publication contributes columns Y, send
the union."

This would be an interesting improvement but it's a larger project... it
would touch pgoutput, tablesync, and the subscriber's relation mapping.
My patch is trying to fix the immediate inconsistency (the view lying
about the catalog state, and the broken relnatts heuristic) without
changing the replication protocol or column merging behavior.

If the view shows {id, name} for both publications, a DBA planning a
schema migration has no way to know that ALTER TABLE ADD COLUMN email
will be replicated for one publication but not the other. The catalog
stores the information needed to make this determination, but the view
actively hides it.

NULL in the view would tell the DBA "this publication replicates
everything, including future columns", which is actionable information.
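
A minimal Python sketch of that scenario, again a toy model rather than
PostgreSQL code: None stands for a publication without a column list,
and replicated_columns is a hypothetical helper invented for this
illustration.

```python
def replicated_columns(column_list, table_columns):
    """None models 'no column list': every current column is replicated.
    An explicit list stays fixed no matter what is added to the table."""
    if column_list is None:
        return list(table_columns)
    return [c for c in column_list if c in table_columns]


columns = ["id", "name"]
pub_all = None             # publication created without a column list
pub_list = ["id", "name"]  # publication with an explicit column list

# Before the migration both publications look identical when expanded.
before_all = replicated_columns(pub_all, columns)
before_list = replicated_columns(pub_list, columns)

# Model of ALTER TABLE ... ADD COLUMN email
columns_after = columns + ["email"]
after_all = replicated_columns(pub_all, columns_after)
after_list = replicated_columns(pub_list, columns_after)

print(before_all == before_list)  # True
print(after_all)                  # ['id', 'name', 'email']
print(after_list)                 # ['id', 'name']
```

The two publications are indistinguishable before the migration if only
the expanded column set is shown, and diverge afterwards.
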
Roberto Mello
Snowflake