Re: Column Filtering in Logical Replication

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Rahila Syed <rahilasyed90(at)gmail(dot)com>
Cc: Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>, Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, Peter Smith <smithpb2250(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Column Filtering in Logical Replication
Date: 2021-08-09 10:45:32
Message-ID: CAA4eK1J9b_0_PMnJ2jq9E55bcbmTKdUmy6jPnkf1Zwy2jxah_g@mail.gmail.com
Lists: pgsql-hackers

On Mon, Aug 9, 2021 at 3:59 PM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
>
> On Mon, Aug 9, 2021 at 1:36 AM Rahila Syed <rahilasyed90(at)gmail(dot)com> wrote:
> >
> >> Having said that, I'm not sure I agree with this design decision; what I
> >> think this is doing is hiding from the user the fact that they are
> >> publishing columns that they don't want to publish. I think as a user I
> >> would rather get an error in that case:
> >
> >
> >> ERROR: invalid column list in published set
> >> DETAIL: The set of published commands does not include all the replica identity columns.
> >
> >
> >> or something like that. Avoid possible nasty surprises of security-
> >> leaking nature.
> >
> >
> > Ok, Thank you for your opinion. I agree that giving an explicit error in this case will be safer.
> >
>
> +1 for an explicit error in this case.
>
> Can you please explain why you have the restriction for including
> replica identity columns and do we want to put a similar restriction
> for the primary key? As far as I understand, if we allow default
> values on subscribers for replica identity, then probably updates,
> deletes won't work as they need to use replica identity (or PK) to
> search the required tuple. If so, shouldn't we add this restriction
> only when a publication has been defined for one of these (Update,
> Delete) actions?
>
> Another point is what if someone drops the column used in one of the
> publications? Do we want to drop the entire relation from publication
> or just remove the column filter or something else?
>
> Do we want to consider that the columns specified in the filter must
> not have NOT NULL constraint? Because, otherwise, the subscriber will
> error out inserting such rows?
>
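
To make the replica identity question above concrete, here is a rough
sketch in Python. It is purely illustrative: the function names and the
action strings are made up for this mail and are not how the patch or
the server implements anything. It assumes the restriction applies only
when the publication publishes updates or deletes, and it reuses the
error wording proposed upthread.

```python
from typing import Iterable, Set

# Hypothetical: the publication actions that need the replica identity
# on the subscriber to locate the tuple to change.
ACTIONS_NEEDING_IDENTITY = {"update", "delete"}


def missing_identity_columns(
    column_list: Iterable[str],
    replica_identity: Iterable[str],
    actions: Iterable[str],
) -> Set[str]:
    """Return replica identity columns absent from the column list.

    An empty set is returned when the publication publishes neither
    updates nor deletes, since inserts do not search for an old tuple.
    """
    published = {action.lower() for action in actions}
    if not (ACTIONS_NEEDING_IDENTITY & published):
        return set()
    return set(replica_identity) - set(column_list)


def check_column_list(
    column_list: Iterable[str],
    replica_identity: Iterable[str],
    actions: Iterable[str],
) -> None:
    """Raise an explicit error instead of silently publishing extra columns."""
    missing = missing_identity_columns(column_list, replica_identity, actions)
    if missing:
        raise ValueError(
            "invalid column list in published set: "
            "missing replica identity columns: " + ", ".join(sorted(missing))
        )
```

With this shape, an insert-only publication could omit the replica
identity columns, while one that publishes updates or deletes would be
rejected up front.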

I noticed that other databases provide this feature [1] and they allow
users to specify "Columns that are included in Filter" or specify "All
columns to be included in filter except for a subset of columns". I am
not sure if we want to provide both ways in the first version, but at
least we should consider it as a future extensibility requirement and
try to choose the syntax accordingly.

[1] - https://docs.oracle.com/en/cloud/paas/goldengate-cloud/gwuad/selecting-columns.html#GUID-9A851C8B-48F7-43DF-8D98-D086BE069E20
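
As a rough illustration of the two ways of specifying the filter
mentioned above, the sketch below (Python, not proposed syntax or
server code) resolves either an include list or an exclude list to the
set of columns that would actually be published. The function name and
its `include`/`exclude` parameters are hypothetical.

```python
from typing import List, Optional, Sequence


def effective_columns(
    table_columns: Sequence[str],
    include: Optional[Sequence[str]] = None,
    exclude: Optional[Sequence[str]] = None,
) -> List[str]:
    """Resolve a column filter to the published columns, in table order.

    Exactly one of `include` or `exclude` may be given; with neither,
    all columns are published.
    """
    if include is not None and exclude is not None:
        raise ValueError("specify either include or exclude, not both")

    # Reject names that are not columns of the table in either mode.
    requested = include if include is not None else (exclude or [])
    known = set(table_columns)
    unknown = [col for col in requested if col not in known]
    if unknown:
        raise ValueError("unknown columns in filter: " + ", ".join(unknown))

    if include is not None:
        wanted = set(include)
        return [col for col in table_columns if col in wanted]

    dropped = set(exclude or [])
    return [col for col in table_columns if col not in dropped]
```

Both forms reduce to the same internal representation, which is one
reason the syntax chosen now should leave room for the "all except"
form later.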

--
With Regards,
Amit Kapila.
