Re: row filtering for logical replication

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Ajin Cherian <itsajin(at)gmail(dot)com>
Cc: "houzj(dot)fnst(at)fujitsu(dot)com" <houzj(dot)fnst(at)fujitsu(dot)com>, vignesh C <vignesh21(at)gmail(dot)com>, Peter Smith <smithpb2250(at)gmail(dot)com>, Greg Nancarrow <gregn4422(at)gmail(dot)com>, "tanghy(dot)fnst(at)fujitsu(dot)com" <tanghy(dot)fnst(at)fujitsu(dot)com>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>, Euler Taveira <euler(at)eulerto(dot)com>, Rahila Syed <rahilasyed90(at)gmail(dot)com>, Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com>, Önder Kalacı <onderkalaci(at)gmail(dot)com>, japin <japinli(at)hotmail(dot)com>, Michael Paquier <michael(at)paquier(dot)xyz>, David Steele <david(at)pgmasters(dot)net>, Craig Ringer <craig(at)2ndquadrant(dot)com>, Amit Langote <amitlangote09(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: row filtering for logical replication
Date: 2021-11-24 13:03:32
Message-ID: CAA4eK1+Xd=kM5D3jtXyN+W7J+wU-yyQAdyq66a6Wcq_PKRTbSw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, Nov 23, 2021 at 4:58 PM Ajin Cherian <itsajin(at)gmail(dot)com> wrote:
>
> Attaching a new patchset v41 which includes changes by both Peter and myself.
>
> Patches v40-0005 and v40-0006 have been merged to create patch
> v41-0005 which reduces the patches to 6 again.
> This patch-set contains changes addressing the following review comments:
>
> On Mon, Nov 15, 2021 at 5:48 PM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> >
> > What I meant was that with this new code we have regressed the old
> > behavior. Basically, imagine a case where no filter was given for any
> > of the tables. Then after the patch, we will remove all the old tables
> > whereas before the patch it will remove the oldrels only when they are
> > not specified as part of new rels. If you agree with this, then we can
> > retain the old behavior and for the new tables, we can always override
> > the where clause for a SET variant of command.
>
> Fixed and modified the behaviour to match with what the schema patch
> implemented.
>

+
+ /*
+ * If the new relation or the old relation has a where clause,
+ * we need to remove it so that it can be added afresh later.
+ */
+ if (RelationGetRelid(newpubrel->relation) == oldrelid &&
+ newpubrel->whereClause == NULL && rfisnull)

Can't we use _equalPublicationTable() here? It compares the whereClause as well.

Few more comments:
=================
0001
1.
@@ -1039,10 +1081,11 @@ PublicationAddTables(Oid pubid, List *rels,
bool if_not_exists,
{
PublicationRelInfo *pub_rel = (PublicationRelInfo *) lfirst(lc);
Relation rel = pub_rel->relation;
+ Oid relid = RelationGetRelid(rel);
ObjectAddress obj;

/* Must be owner of the table or superuser. */
- if (!pg_class_ownercheck(RelationGetRelid(rel), GetUserId()))
+ if (!pg_class_ownercheck(relid, GetUserId()))

Here, you can directly use RelationGetRelid as was used in the
previous code without using an additional variable.

0005
2.
+typedef struct {
+ Relation rel;
+ bool check_replident;
+ Bitmapset *bms_replident;
+}
+rf_context;

Add rf_context in the same line where } ends.

3. In the function header comment of rowfilter_walker, you mentioned
the simple expressions allowed but we should write why we are doing
so. It has been discussed in detail in various emails in this thread.
AFAIR, below are the reasons:
A. We don't want to allow user-defined functions or operators because
(a) if the user drops such a function/operator or if there is any
other error via that function, the walsender won't be able to recover
from such an error even if we fix the function's problem because it
uses a historic snapshot to access row-filter; (b) any other table
could be accessed via a function which won't work because of historic
snapshots in logical decoding environment.

B. We don't allow anything other immutable built-in functions as those
can access database and would lead to the problem (b) mentioned in the
previous paragraph.

Don't we need to check for user-defined types similar to user-defined
functions and operators? If not why?

4.
+ * Rules: Node-type validation
+ * ---------------------------
+ * Allow only simple or compound expressions like:
+ * - "(Var Op Const)" or

It seems Var Op Var is allowed. I tried below and it works:
create publication pub for table t1 where (c1 < c2) WITH (publish = 'insert');

I think it should be okay to allow it provided we ensure that we never
access some other table/view etc. as part of the expression. Also, we
should document the behavior correctly.

--
With Regards,
Amit Kapila.

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Joshua Brindle 2021-11-24 13:46:35 Re: Support for NSS as a libpq TLS backend
Previous Message Masahiko Sawada 2021-11-24 12:48:01 Re: Rename dead_tuples to dead_items in vacuumlazy.c