Re: Column Filtering in Logical Replication

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Rahila Syed <rahilasyed90(at)gmail(dot)com>
Cc: Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>, Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, Peter Smith <smithpb2250(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Column Filtering in Logical Replication
Date: 2021-09-04 04:42:53
Message-ID: CAA4eK1LDLOy_DaZmwwscbknM51w5PEYeF65NfGHk-nZ1aTb54w@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, Sep 2, 2021 at 2:51 AM Rahila Syed <rahilasyed90(at)gmail(dot)com> wrote:
>
>>>
>>> Do we want to consider that the columns specified in the filter must
>>> not have NOT NULL constraint? Because, otherwise, the subscriber will
>>> error out inserting such rows?
>>>
>> I think you mean columns *not* specified in the filter must not have NOT NULL constraint
>> on the subscriber, as this will break during insert, as it will try to insert NULL for columns
>> not sent by the publisher.
>> I will look into fixing this. Probably this won't be a problem in
>> case the column is auto generated or contains a default value.
>>
>
> I am not sure if this needs to be handled. Ideally, we need to prevent the subscriber tables from having a NOT NULL
> constraint if the publisher uses column filters to publish the values of the table. There is no way
> to do this at the time of creating a table on subscriber.
>
> As this involves querying the publisher for this information, it can be done at the time of initial table synchronization.
> i.e error out if any of the subscribed tables has NOT NULL constraint on non-filter columns.
> This will lead to the user dropping and recreating the subscription after removing the
> NOT NULL constraint from the table.
> I think the same can be achieved by doing nothing and letting the subscriber error out while inserting rows.
>

That makes sense and also it is quite possible that users don't have
such columns in the tables on subscribers. I guess we can add such a
recommendation in the docs instead of doing anything in the code.

Few comments:
============
1.
+
+ /*
+ * Cannot specify column filter when REPLICA IDENTITY IS FULL
+ * or if column filter does not contain REPLICA IDENITY columns
+ */
+ if (targetcols != NIL)
+ {
+ if (replidentfull)
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("cannot add relation \"%s\" to publication",
+ RelationGetRelationName(targetrel)),
+ errdetail("Cannot have column filter with REPLICA IDENTITY FULL")));

Why do we want to have such a restriction for REPLICA IDENTITY FULL? I
think it is better to expand comments in that regards.

2.
@@ -839,7 +839,6 @@ NextCopyFrom(CopyFromState cstate, ExprContext *econtext,
ereport(ERROR,
(errcode(ERRCODE_BAD_COPY_FILE_FORMAT),
errmsg("extra data after last expected column")));
-
fieldno = 0;

@@ -944,7 +992,6 @@ logicalrep_write_attrs(StringInfo out, Relation rel)
flags |= LOGICALREP_IS_REPLICA_IDENTITY;

pq_sendbyte(out, flags);
-
/* attribute name */
pq_sendstring(out, NameStr(att->attname));

@@ -953,6 +1000,7 @@ logicalrep_write_attrs(StringInfo out, Relation rel)

/* attribute mode */
pq_sendint32(out, att->atttypmod);
+
}

Spurious line removals and addition.

--
With Regards,
Amit Kapila.

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Amit Kapila 2021-09-04 05:00:17 Re: Column Filtering in Logical Replication
Previous Message Amit Kapila 2021-09-04 04:11:30 Re: Column Filtering in Logical Replication