RE: row filtering for logical replication

From: "houzj(dot)fnst(at)fujitsu(dot)com" <houzj(dot)fnst(at)fujitsu(dot)com>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: vignesh C <vignesh21(at)gmail(dot)com>, Peter Smith <smithpb2250(at)gmail(dot)com>, Greg Nancarrow <gregn4422(at)gmail(dot)com>, "tanghy(dot)fnst(at)fujitsu(dot)com" <tanghy(dot)fnst(at)fujitsu(dot)com>, Ajin Cherian <itsajin(at)gmail(dot)com>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>, Euler Taveira <euler(at)eulerto(dot)com>, Rahila Syed <rahilasyed90(at)gmail(dot)com>, Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com>, Önder Kalacı <onderkalaci(at)gmail(dot)com>, japin <japinli(at)hotmail(dot)com>, Michael Paquier <michael(at)paquier(dot)xyz>, David Steele <david(at)pgmasters(dot)net>, Craig Ringer <craig(at)2ndquadrant(dot)com>, Amit Langote <amitlangote09(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: RE: row filtering for logical replication
Date: 2021-11-25 13:39:45
Message-ID: OS0PR01MB57168FD9932E3F42406EB13B94629@OS0PR01MB5716.jpnprd01.prod.outlook.com
Lists: pgsql-hackers

On Wed, Nov 24, 2021 1:46 PM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> On Wed, Nov 24, 2021 at 6:51 AM houzj(dot)fnst(at)fujitsu(dot)com
> <houzj(dot)fnst(at)fujitsu(dot)com> wrote:
> >
> > On Tues, Nov 23, 2021 6:16 PM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> > > On Tue, Nov 23, 2021 at 1:29 PM houzj(dot)fnst(at)fujitsu(dot)com
> > > <houzj(dot)fnst(at)fujitsu(dot)com> wrote:
> > > >
> > > > On Tues, Nov 23, 2021 2:27 PM vignesh C <vignesh21(at)gmail(dot)com> wrote:
> > > > > On Thu, Nov 18, 2021 at 7:04 AM Peter Smith
> > > > > <smithpb2250(at)gmail(dot)com>
> > > > > wrote:
> > > > > >
> > > > > > PSA new set of v40* patches.
> > > > >
> > > > > Few comments:
> > > > > 1) When a table is added to the publication, replica identity is
> > > > > checked. But while modifying the publish action to include
> > > > > delete/update, replica identity is not checked for the existing
> > > > > tables. I felt it should be checked for the existing tables too.
> > > >
> > > > In addition to this, I think we might also need some check to
> > > > prevent user from changing the REPLICA IDENTITY index which is used in
> > > > the filter expression.
> > > >
> > > > I was wondering whether it is possible to do the check related to
> > > > REPLICA IDENTITY in function CheckCmdReplicaIdentity() or in
> > > > GetRelationPublicationActions(). If we move the REPLICA IDENTITY
> > > > check to one of these functions, it would be consistent with the
> > > > existing behavior of the REPLICA IDENTITY check (see the
> > > > comments in CheckCmdReplicaIdentity) and it seems it would cover all
> > > > the cases mentioned above.
> > >
> > > Yeah, adding the replica identity check in CheckCmdReplicaIdentity()
> > > would cover all the above cases but I think that would put a premium
> > > on each update/delete operation. I think traversing the expression
> > > tree (it could be multiple traversals if the relation is part of
> > > multiple publications) during each update/delete would be costly.
> > > Don't you think so?
> >
> > Yes, I agree that traversing the expression every time would be costly.
> >
> > I thought maybe we could cache the columns used in the row filter, or
> > cache only a flag (can_update|delete), in the relcache. I think every
> > operation that affects the row filter or replica identity will invalidate
> > the relcache, so the cost of the check seems acceptable with the cache.
> >
>
> I think if we can cache this information especially as a bool flag then that should
> probably be better.
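
To illustrate the cached-flag idea quoted above, here is a toy Python model. It
is only a sketch, not PostgreSQL code: RelCacheEntry, Publication, invalidate()
and check_update_delete() are hypothetical names, and the rule that only
publications publishing update/delete are checked is an assumption made for the
example.

```python
from dataclasses import dataclass
from typing import List, Optional, Set


@dataclass
class Publication:
    name: str
    actions: Set[str]          # published actions, e.g. {"insert"}
    filter_columns: Set[str]   # columns referenced by the row filter


@dataclass
class RelCacheEntry:
    """Toy stand-in for a relcache entry (hypothetical, not PostgreSQL's struct)."""
    replica_identity: Set[str]
    publications: List[Publication]
    can_update_delete: Optional[bool] = None  # None means "not computed yet"
    walks: int = 0                            # counts the expensive recomputations

    def invalidate(self) -> None:
        # Called whenever a row filter or the replica identity changes.
        self.can_update_delete = None

    def check_update_delete(self) -> bool:
        if self.can_update_delete is None:
            self.walks += 1
            # Assumption: only publications that publish update/delete matter.
            self.can_update_delete = all(
                pub.filter_columns <= self.replica_identity
                for pub in self.publications
                if pub.actions & {"update", "delete"}
            )
        return self.can_update_delete


entry = RelCacheEntry({"a"}, [Publication("B", {"update"}, {"a"})])
first = entry.check_update_delete()    # walks the filters once
second = entry.check_update_delete()   # served from the cached flag

# A new filter on a non-identity column invalidates the cached flag.
entry.publications.append(Publication("C", {"update"}, {"b"}))
entry.invalidate()
third = entry.check_update_delete()    # recomputed after invalidation
```

In this model the filters are only walked again after an invalidation, which is
the property that would keep the per-update/delete cost low.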

While researching and writing a top-up patch for this, I found a possible
issue that I'd like to confirm first.

A table can be published in two publications, A and B, where publication A
only publishes "insert" and publication B publishes "update". On UPDATE, the
row filters of both A and B are executed. Is this behavior expected?

For example:
---- Publication
create table tbl1 (a int primary key, b int);
create publication A for table tbl1 where (b<2) with(publish='insert');
create publication B for table tbl1 where (a>1) with(publish='update');

---- Subscription
create table tbl1 (a int primary key);
CREATE SUBSCRIPTION sub CONNECTION 'dbname=postgres host=localhost
port=10000' PUBLICATION A,B;

---- Publication
update tbl1 set a = 2;

Both publications can be created, and on UPDATE the row filter in A (b<2) is
also executed, even though the column it references (b) is not part of the
replica identity.
(I am not against this behavior; I just want to confirm it.)
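
The scenario can also be modeled outside the server with a small Python sketch.
This is a toy model, not the patch's code: Pub, filters_run() and
outside_identity() are hypothetical names. With match_action=False it mimics
the behavior described here (every publication's filter runs on UPDATE);
match_action=True is a hypothetical alternative shown only for comparison.

```python
from dataclasses import dataclass
from typing import Dict, List, Set


@dataclass
class Pub:
    name: str
    actions: Set[str]          # value of the publish option
    filter_columns: Set[str]   # columns referenced in the WHERE clause


# Mirrors the example: A publishes insert with (b<2), B publishes update with (a>1).
pubs = [
    Pub("A", {"insert"}, {"b"}),
    Pub("B", {"update"}, {"a"}),
]
replica_identity = {"a"}       # tbl1's primary key


def filters_run(action: str, match_action: bool) -> List[Pub]:
    """Publications whose row filter is evaluated for the given action."""
    return [p for p in pubs if not match_action or action in p.actions]


def outside_identity(selected: List[Pub]) -> Dict[str, List[str]]:
    """Filter columns that are not covered by the replica identity."""
    return {
        p.name: sorted(p.filter_columns - replica_identity)
        for p in selected
        if p.filter_columns - replica_identity
    }


described = filters_run("update", match_action=False)
alternative = filters_run("update", match_action=True)

described_names = [p.name for p in described]      # filters of A and B both run
described_problem = outside_identity(described)    # A's filter needs column b
alternative_names = [p.name for p in alternative]  # only B's filter runs
alternative_problem = outside_identity(alternative)
```

In the model, only the first variant evaluates a filter on a column outside the
replica identity, which is the case I would like to confirm.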

Best regards,
Hou zj
