Re: row filtering for logical replication

From: Peter Smith <smithpb2250(at)gmail(dot)com>
To: "houzj(dot)fnst(at)fujitsu(dot)com" <houzj(dot)fnst(at)fujitsu(dot)com>
Cc: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, Euler Taveira <euler(at)eulerto(dot)com>, Greg Nancarrow <gregn4422(at)gmail(dot)com>, vignesh C <vignesh21(at)gmail(dot)com>, Ajin Cherian <itsajin(at)gmail(dot)com>, "tanghy(dot)fnst(at)fujitsu(dot)com" <tanghy(dot)fnst(at)fujitsu(dot)com>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>, Rahila Syed <rahilasyed90(at)gmail(dot)com>, Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com>, Önder Kalacı <onderkalaci(at)gmail(dot)com>, japin <japinli(at)hotmail(dot)com>, Michael Paquier <michael(at)paquier(dot)xyz>, David Steele <david(at)pgmasters(dot)net>, Craig Ringer <craig(at)2ndquadrant(dot)com>, Amit Langote <amitlangote09(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: row filtering for logical replication
Date: 2022-02-09 01:37:12
Message-ID: CAHut+Pue8wSU7TTZ5Hy9p6wQ32LVFS1NezT9UtSK-FsDWMYP=Q@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

I did a review of the v79 patch. Below are my review comments:

======

1. doc/src/sgml/ref/create_publication.sgml - CREATE PUBLICATION

The commit message for v79-0001 says:
<quote>
If your publication contains a partitioned table, the publication parameter
publish_via_partition_root determines if it uses the partition row filter (if
the parameter is false, the default) or the root partitioned table row filter.
</quote>

I think that the same information should also be mentioned in the PG
DOCS for CREATE PUBLICATION note about the WHERE clause.

~~~

2. src/backend/commands/publicationcmds.c -
contain_mutable_or_ud_functions_checker

+/* check_functions_in_node callback */
+static bool
+contain_mutable_or_user_functions_checker(Oid func_id, void *context)
+{
+ return (func_volatile(func_id) != PROVOLATILE_IMMUTABLE ||
+ func_id >= FirstNormalObjectId);
+}

I was wondering why is the checking for user function and mutable
functions combined in one function like this. IMO it might be better
to have 2 "checker" callback functions instead of just one - then the
error messages can be split too so that only the relevant one is
displayed to the user.

BEFORE
contain_mutable_or_user_functions_checker --> "User-defined or
built-in mutable functions are not allowed."

AFTER
contain_user_functions_checker --> "User-defined functions are not allowed."
contain_mutable_function_checker --> "Built-in mutable functions are
not allowed."

~~~

3. src/backend/commands/publicationcmds.c - check_simple_rowfilter_expr_walker

+ case T_Const:
+ case T_FuncExpr:
+ case T_BoolExpr:
+ case T_RelabelType:
+ case T_CollateExpr:
+ case T_CaseExpr:
+ case T_CaseTestExpr:
+ case T_ArrayExpr:
+ case T_CoalesceExpr:
+ case T_MinMaxExpr:
+ case T_XmlExpr:
+ case T_NullTest:
+ case T_BooleanTest:
+ case T_List:
+ break;

Perhaps a comment should be added here simply saying "OK, supported"
just to make it more obvious?

~~~

4. src/test/regress/sql/publication.sql - test comment

+-- fail - user-defined types disallowed

For consistency with the nearby comments it would be better to reword this one:
"fail - user-defined types are not allowed"

~~~

5. src/test/regress/sql/publication.sql - test for \d

+-- test \d+ (now it displays filter information)
+SET client_min_messages = 'ERROR';
+CREATE PUBLICATION testpub_dplus_rf_yes FOR TABLE testpub_rf_tbl1
WHERE (a > 1) WITH (publish = 'insert');
+CREATE PUBLICATION testpub_dplus_rf_no FOR TABLE testpub_rf_tbl1;
+RESET client_min_messages;
+\d+ testpub_rf_tbl1

Actually, the \d (without "+") will also display filters but I don't
think that has been tested anywhere. So suggest updating the comment
and adding one more test

AFTER
-- test \d+ <tablename> and \d <tablename> (now these display filter
information)
...
\d+ testpub_rf_tbl1
\d testpub_rf_tbl1

~~~

6. src/test/regress/sql/publication.sql - tests for partitioned table

+-- Tests for partitioned table
+ALTER PUBLICATION testpub6 SET TABLE rf_tbl_abcd_part_pk WHERE (a > 99);
+ALTER PUBLICATION testpub6 SET (PUBLISH_VIA_PARTITION_ROOT=0);
+-- ok - partition does not have row filter
+UPDATE rf_tbl_abcd_part_pk SET a = 1;
+ALTER PUBLICATION testpub6 SET (PUBLISH_VIA_PARTITION_ROOT=1);
+-- ok - "a" is a OK col
+UPDATE rf_tbl_abcd_part_pk SET a = 1;
+ALTER PUBLICATION testpub6 SET TABLE rf_tbl_abcd_part_pk WHERE (b > 99);
+ALTER PUBLICATION testpub6 SET (PUBLISH_VIA_PARTITION_ROOT=0);
+-- ok - partition does not have row filter
+UPDATE rf_tbl_abcd_part_pk SET a = 1;
+ALTER PUBLICATION testpub6 SET (PUBLISH_VIA_PARTITION_ROOT=1);
+-- fail - "b" is not in REPLICA IDENTITY INDEX
+UPDATE rf_tbl_abcd_part_pk SET a = 1;

Those comments and the way the code is arranged did not make it very
clear to me what exactly these tests are doing.

I think all the changes to the publish_via_partition_root belong BELOW
those test comments don't they?
Also the same comment "-- ok - partition does not have row filter"
appears 2 times so that can be made more clear too.

e.g. IIUC it should be changed to something a bit like this (Note - I
did not change the SQL, I only moved it a bit and changed the
comments):

AFTER (??)
-- Tests for partitioned table
ALTER PUBLICATION testpub6 SET TABLE rf_tbl_abcd_part_pk WHERE (a > 99);

-- ok - PUBLISH_VIA_PARTITION_ROOT is false
-- Here the partition does not have a row filter
-- Col "a" is in replica identity.
ALTER PUBLICATION testpub6 SET (PUBLISH_VIA_PARTITION_ROOT=0);
UPDATE rf_tbl_abcd_part_pk SET a = 1;

-- ok - PUBLISH_VIA_PARTITION_ROOT is true
-- Here the partition does not have a row filter, so the root filter
will be used.
-- Col "a" is in replica identity.
ALTER PUBLICATION testpub6 SET (PUBLISH_VIA_PARTITION_ROOT=1);
UPDATE rf_tbl_abcd_part_pk SET a = 1;

-- Now change the root filter to use a column "b" (which is not in the
replica identity)
ALTER PUBLICATION testpub6 SET TABLE rf_tbl_abcd_part_pk WHERE (b > 99);

-- ok - PUBLISH_VIA_PARTITION_ROOT is false
-- Here the partition does not have a row filter
-- Col "a" is in replica identity.
ALTER PUBLICATION testpub6 SET (PUBLISH_VIA_PARTITION_ROOT=0);
UPDATE rf_tbl_abcd_part_pk SET a = 1;

-- fail - PUBLISH_VIA_PARTITION_ROOT is true
-- Here the root filter will be used, but the "b" referenced in the
root filter is not in replica identiy.
ALTER PUBLICATION testpub6 SET (PUBLISH_VIA_PARTITION_ROOT=1);
UPDATE rf_tbl_abcd_part_pk SET a = 1;

------
Kind Regards,
Peter Smith.
Fujitsu Australia

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Andres Freund 2022-02-09 01:43:34 Re: pgsql: Avoid race in RelationBuildDesc() affecting CREATE INDEX CONCURR
Previous Message Tomas Vondra 2022-02-09 01:25:09 Re: pgsql: Avoid race in RelationBuildDesc() affecting CREATE INDEX CONCURR