Re: Column Filtering in Logical Replication

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Dilip Kumar <dilipbalaut(at)gmail(dot)com>
Cc: Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, Rahila Syed <rahilasyed90(at)gmail(dot)com>, Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>, Peter Smith <smithpb2250(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Column Filtering in Logical Replication
Date: 2021-09-07 06:43:08
Message-ID: CAA4eK1KCGF43pfLv8+mixcTMs=Nkd6YdWL53LhiT1DvnuTg01g@mail.gmail.com
Lists: pgsql-hackers

On Tue, Sep 7, 2021 at 11:26 AM Dilip Kumar <dilipbalaut(at)gmail(dot)com> wrote:
>
> On Tue, Sep 7, 2021 at 11:06 AM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
>>
>> On Mon, Sep 6, 2021 at 11:21 PM Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org> wrote:
>> >
>> > On 2021-Sep-06, Rahila Syed wrote:
>> >
>> > > > > ... ugh. Since CASCADE is already defined to be a
>> > > > > potentially-data-loss operation, then that may be acceptable
>> > > > > behavior. For sure the default RESTRICT behavior shouldn't do it,
>> > > > > though.
>> > > >
>> > > > That makes sense to me.
>> > >
>> > > However, the default (RESTRICT) behaviour of DROP TABLE allows
>> > > removing the table from the publication. I have implemented the
>> > > removal of table from publication on drop column (RESTRICT) on the
>> > > same lines.
>> >
>> > But dropping the table is quite a different action from dropping a
>> > column, isn't it? If you drop a table, it seems perfectly reasonable
>> > that it has to be removed from the publication -- essentially, when the
>> > user drops a table, she is saying "I don't care about this table
>> > anymore". However, if you drop just one column, that doesn't
>> > necessarily mean that the user wants to stop publishing the whole table.
>> > Removing the table from the publication in ALTER TABLE DROP COLUMN seems
>> > like an overreaction. (Except perhaps in the special case where the
>> > column being dropped is the only one that was being published.)
>> >
>> > So let's discuss what should happen. If you drop a column, and the
>> > column is filtered out, then it seems to me that the publication should
>> > continue to have the table, and it should continue to filter out the
>> > other columns that were being filtered out, regardless of CASCADE/RESTRICT.
>> >
>>
>> Yeah, for this case we don't need to do anything and I am not sure if
>> the patch is dropping tables in this case?
>>
>> > However, if the column is *included* in the publication, and you drop
>> > it, ISTM there are two cases:
>> >
>> > 1. If it's DROP CASCADE, then the list of columns to replicate should
>> > continue to have all columns it previously had, so just remove the
>> > column that is being dropped.
>> >
>>
>> Note that for a somewhat similar case in the index (where the index
>> has an expression) we drop the index if one of the columns used in the
>> index expression is dropped, so we might want to just remove the
>> entire filter here instead of just removing the particular column or
>> remove the entire table from publication as Rahila is proposing.
>>
>> I think removing just a particular column can break the replication
>> for Updates and Deletes if the removed column is part of replica
>> identity.
>
>
> But how is this specific to this patch? I think the behavior should be the same as it is now; even now we can drop columns that are part of the replica identity, right?
>

Sure, but we drop the replica identity and the corresponding index as
well. The patch requires replica identity columns to be part of the
column filter, and now that restriction won't hold anymore. I think if
we want to retain that restriction, then it is better to either remove
the entire filter or remove the entire table. Anyway, the main point
was that if we can remove the index/replica identity, it seems like
there should be the same treatment for the column filter.
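
To make the "remove the entire filter" option concrete, here is a
minimal sketch in Python rather than PostgreSQL C. The names
PublishedTable, filter_is_valid and drop_column are hypothetical and
exist only for illustration; this is a toy model of the rule under
discussion, not what the patch does.

```python
from dataclasses import dataclass
from typing import Optional, Set


@dataclass
class PublishedTable:
    """Hypothetical model of one table in a publication (not PostgreSQL code)."""
    columns: Set[str]
    replica_identity: Set[str]
    column_filter: Optional[Set[str]]  # None means "publish all columns"

    def filter_is_valid(self) -> bool:
        # Restriction discussed in this thread: every replica identity
        # column must be part of the column filter.
        return (self.column_filter is None
                or self.replica_identity <= self.column_filter)


def drop_column(table: PublishedTable, col: str) -> None:
    """Drop a column; if it was in the filter, remove the entire filter."""
    table.columns.discard(col)
    # Dropping a replica identity column drops the replica identity too.
    if col in table.replica_identity:
        table.replica_identity = set()
    # Assumed policy: same treatment as the index, i.e. remove the whole
    # filter instead of only the dropped column.
    if table.column_filter is not None and col in table.column_filter:
        table.column_filter = None
```

Dropping a column that is not in the filter leaves the filter
untouched, which matches the "filtered out" case earlier in the thread.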

Another related point that occurred to me is that if the user changes
the replica identity, then we should probably ensure that the column
filter for the table still meets the criteria, or maybe we need to
remove the filter in that case as well. I am not sure whether the patch
already does something about this; if not, isn't it better to handle
it?
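
As a rough illustration of the second option (removing the filter), a
check on replica identity change could look like the Python sketch
below. The function name filter_after_replica_identity_change is
hypothetical, and the policy shown is only one of the two
possibilities mentioned; the alternative would be to reject the change
with an error.

```python
from typing import Optional, Set


def filter_after_replica_identity_change(
    column_filter: Optional[Set[str]],
    new_replica_identity: Set[str],
) -> Optional[Set[str]]:
    """Return the column filter to keep after a replica identity change.

    Hypothetical policy: keep the filter if it still covers every
    replica identity column, otherwise remove it. None means "no
    filter", i.e. all columns are published.
    """
    if column_filter is None:
        return None
    if new_replica_identity <= column_filter:
        return column_filter
    return None
```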

--
With Regards,
Amit Kapila.
