Re: pgoutput incorrectly replaces missing values with NULL since PostgreSQL 15

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Nikhil Benesch <nikhil(dot)benesch(at)gmail(dot)com>
Cc: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, "sean(at)materialize(dot)com" <sean(at)materialize(dot)com>, "petrosagg(at)materialize(dot)com" <petrosagg(at)materialize(dot)com>
Subject: Re: pgoutput incorrectly replaces missing values with NULL since PostgreSQL 15
Date: 2023-11-23 09:03:14
Message-ID: CAA4eK1+=Ls2b19Nd6g2p_iM41r1UohaN+c8OQQccGhsjX5vnSg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, Nov 23, 2023 at 1:10 PM Nikhil Benesch <nikhil(dot)benesch(at)gmail(dot)com> wrote:
>
> While working on Materialize's streaming logical replication from Postgres [0],
> my colleagues Sean Loiselle and Petros Angelatos (CC'd) discovered today what
> appears to be a correctness bug in pgoutput, introduced in v15.
>
> The problem goes like this. A table with REPLICA IDENTITY FULL and some
> data in it...
>
> CREATE TABLE t (a int);
> ALTER TABLE t REPLICA IDENTITY FULL;
> INSERT INTO t VALUES (1), (2), (3), ...;
>
> ...undergoes a schema change to add a new column with a default:
>
> ALTER TABLE t ADD COLUMN b bool DEFAULT false NOT NULL;
>
> PostgreSQL is smart and does not rewrite the entire table during the schema
> change. Instead it updates the tuple description to indicate to future readers
> of the table that if `b` is missing, it should be filled in with the default
> value, `false`.
>
> Unfortunately, since v15, pgoutput mishandles missing attributes. If a
> downstream server is subscribed to changes from t via the pgoutput plugin, when
> a row with a missing attribute is updated, e.g.:
>
> UPDATE t SET a = 2 WHERE a = 1
>
> pgoutput will incorrectly report b's value as NULL in the old tuple, rather than
> false.
>

Thanks, I could reproduce this behavior. I'll look into your patch.

> Using the same example:
>
> old: a=1, b=NULL
> new: a=2, b=true
>
> The subscriber will ignore the update (as it has no row with values
> a=1, b=NULL), and thus the subscriber's copy of `t` will become out of sync with
> the publisher's.
>
> I bisected the problem to 52e4f0cd4 [1], which introduced row filtering for
> publications. The problem appears to be the use of CreateTupleDescCopy where
> CreateTupleDescCopyConstr is required, as the former drops the constraints
> in the tuple description (specifically, the default value constraint) on the
> floor.
>
> I've attached a patch which both fixes the issue and includes a test. I've
> verified that the test fails against the current master and passes against
> the patched version.
>
> I'm relatively unfamiliar with the project norms here, but assuming the patch is
> acceptable, this strikes me as important enough to warrant a backport to both
> v15 and v16.
>

Right.

--
With Regards,
Amit Kapila.

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Peter Eisentraut 2023-11-23 09:08:54 Re: SQL:2011 application time
Previous Message shveta malik 2023-11-23 08:59:03 Re: Synchronizing slots from primary to standby