Re: pg_publication_tables: return NULL attnames when no column list is specified

From: Roberto Mello <roberto(dot)mello(at)gmail(dot)com>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: PostgreSQL Developers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: pg_publication_tables: return NULL attnames when no column list is specified
Date: 2026-03-30 18:32:17
Message-ID: CAKz==bJwd79+TMOgaxG2jBCoz8b=Nt3v3RR=fA29A=0NhFhcPQ@mail.gmail.com
Lists: pgsql-hackers

On Mon, Mar 30, 2026 at 1:21 AM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:

> On Thu, Mar 26, 2026 at 1:21 AM Roberto Mello <roberto(dot)mello(at)gmail(dot)com> wrote:
>
> > By making them indistinguishable, the synthesis hid a real conflict from users
> > who had a table in two publications with different column semantics on the
> > same subscription. I am proposing a fix that restores the distinction and
> > correctly (IMO) surfaces this conflict.
> >
>
> I would like to understand why shall we consider this as a conflict?
> IIRC, we tried to ensure that if in future new columns get added to
> the relation and the same is not updated in the explicit column list
> then it will result in error.
>

Hi Amit,

The conflict exists because the two publications have different contracts
about future schema changes, and the subscriber has no way to honor both
simultaneously.

For example:

CREATE TABLE t (id int, name text);
CREATE PUBLICATION pub_a FOR TABLE t; -- no column list
CREATE PUBLICATION pub_b FOR TABLE t (id, name); -- explicit list
CREATE SUBSCRIPTION sub CONNECTION '...' PUBLICATION pub_a, pub_b;

At this point both publications replicate the same columns. But after:

ALTER TABLE t ADD COLUMN email text;

pub_a now replicates {id, name, email} (automatically, because no column
list means all current and future columns), while pub_b still replicates
{id, name} (the explicit list hasn't been altered).

The subscriber receives WAL from both publications for the same table.
Which column set should it apply? It cannot apply both as they disagree on
whether email is included. This is exactly the situation the "cannot use
different column lists" check was designed to prevent.

The current code suppresses this error by making the two cases look
identical at query time. But the underlying catalog still stores NULL
for pub_a and {1,2} for pub_b, and the actual replication behavior at WAL
decode time (in pgoutput.c) still treats them differently. So the conflict
is real but hidden from the check that is supposed to detect it.
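
For anyone who wants to see the stored difference directly, a query along
these lines should show it. This is only a sketch against the example above:
it assumes the pg_publication_rel catalog and its prattrs column (available
in releases that support publication column lists) and the table name t from
the example.

```sql
-- Show the stored column list for each publication that includes table t.
-- prattrs is NULL when the table was added without a column list.
SELECT p.pubname, pr.prattrs
FROM pg_publication_rel pr
JOIN pg_publication p ON p.oid = pr.prpubid
WHERE pr.prrelid = 't'::regclass
ORDER BY p.pubname;
```

With the setup above, one would expect NULL for pub_a and the attribute
numbers of id and name for pub_b.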

To put it another way: the check's purpose is to ensure a single consistent
column set for each table across all publications on a subscription. Two
publications that will diverge on the next ALTER TABLE ADD COLUMN
do not provide that guarantee. Surfacing the conflict at subscription
creation / refresh time - before the schema change happens - is better than
discovering it after, when the subscriber receives incompatible column sets
and replication breaks.

For users who currently have this configuration, the fix is straightforward:
either drop the explicit column list from pub_b (so both mean "all columns"),
or keep the explicit list and use a separate subscription.
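
Roughly, the two options could look like the following. It is a sketch, not
a tested recipe: sub_b is a hypothetical subscription name, the connection
string stays elided as in the example above, and whether a second
subscription covering the same table is acceptable depends on the setup.

```sql
-- Option 1: drop the explicit column list from pub_b, so it also means
-- "all columns". Note that SET TABLE replaces pub_b's entire table list,
-- so any other tables in pub_b would have to be listed here as well.
ALTER PUBLICATION pub_b SET TABLE t;

-- Option 2: keep the explicit list, but move pub_b to its own subscription.
ALTER SUBSCRIPTION sub DROP PUBLICATION pub_b;
CREATE SUBSCRIPTION sub_b CONNECTION '...' PUBLICATION pub_b;  -- sub_b is hypothetical
```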

Roberto Mello
Snowflake
