RE: row filtering for logical replication

From: "houzj(dot)fnst(at)fujitsu(dot)com" <houzj(dot)fnst(at)fujitsu(dot)com>
To: Peter Smith <smithpb2250(at)gmail(dot)com>
Cc: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, Euler Taveira <euler(at)eulerto(dot)com>, Greg Nancarrow <gregn4422(at)gmail(dot)com>, vignesh C <vignesh21(at)gmail(dot)com>, Ajin Cherian <itsajin(at)gmail(dot)com>, "tanghy(dot)fnst(at)fujitsu(dot)com" <tanghy(dot)fnst(at)fujitsu(dot)com>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>, Rahila Syed <rahilasyed90(at)gmail(dot)com>, Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com>, Önder Kalacı <onderkalaci(at)gmail(dot)com>, japin <japinli(at)hotmail(dot)com>, Michael Paquier <michael(at)paquier(dot)xyz>, David Steele <david(at)pgmasters(dot)net>, Craig Ringer <craig(at)2ndquadrant(dot)com>, Amit Langote <amitlangote09(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: RE: row filtering for logical replication
Date: 2022-02-10 03:59:16
Message-ID: OS0PR01MB5716B2A4901563E44637B7E2942F9@OS0PR01MB5716.jpnprd01.prod.outlook.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wednesday, February 9, 2022 9:37 AM Peter Smith <smithpb2250(at)gmail(dot)com>
>
> I did a review of the v79 patch. Below are my review comments:
>

Thanks for the comments!

>
> 1. doc/src/sgml/ref/create_publication.sgml - CREATE PUBLICATION
>
> The commit message for v79-0001 says:
> <quote>
> If your publication contains a partitioned table, the publication parameter
> publish_via_partition_root determines if it uses the partition row filter (if
> the parameter is false, the default) or the root partitioned table row filter.
> </quote>
>
> I think that the same information should also be mentioned in the PG
> DOCS for CREATE PUBLICATION note about the WHERE clause.
>

Added this to the document.

>
> 2. src/backend/commands/publicationcmds.c -
> contain_mutable_or_ud_functions_checker
>
> +/* check_functions_in_node callback */
> +static bool
> +contain_mutable_or_user_functions_checker(Oid func_id, void *context)
> +{
> + return (func_volatile(func_id) != PROVOLATILE_IMMUTABLE ||
> + func_id >= FirstNormalObjectId);
> +}
>
> I was wondering why is the checking for user function and mutable
> functions combined in one function like this. IMO it might be better
> to have 2 "checker" callback functions instead of just one - then the
> error messages can be split too so that only the relevant one is
> displayed to the user.
>
> BEFORE
> contain_mutable_or_user_functions_checker --> "User-defined or
> built-in mutable functions are not allowed."
>
> AFTER
> contain_user_functions_checker --> "User-defined functions are not allowed."
> contain_mutable_function_checker --> "Built-in mutable functions are
> not allowed."

As Amit mentioned, I didn’t change this.

>
> 3. src/backend/commands/publicationcmds.c -
> check_simple_rowfilter_expr_walker
>
> + case T_Const:
> + case T_FuncExpr:
> + case T_BoolExpr:
> + case T_RelabelType:
> + case T_CollateExpr:
> + case T_CaseExpr:
> + case T_CaseTestExpr:
> + case T_ArrayExpr:
> + case T_CoalesceExpr:
> + case T_MinMaxExpr:
> + case T_XmlExpr:
> + case T_NullTest:
> + case T_BooleanTest:
> + case T_List:
> + break;
>
> Perhaps a comment should be added here simply saying "OK, supported"
> just to make it more obvious?

Added.

>
> 4. src/test/regress/sql/publication.sql - test comment
>
> +-- fail - user-defined types disallowed
>
> For consistency with the nearby comments it would be better to reword this
> one:
> "fail - user-defined types are not allowed"

Changed.

>
> 5. src/test/regress/sql/publication.sql - test for \d
>
> +-- test \d+ (now it displays filter information)
> +SET client_min_messages = 'ERROR';
> +CREATE PUBLICATION testpub_dplus_rf_yes FOR TABLE testpub_rf_tbl1
> WHERE (a > 1) WITH (publish = 'insert');
> +CREATE PUBLICATION testpub_dplus_rf_no FOR TABLE testpub_rf_tbl1;
> +RESET client_min_messages;
> +\d+ testpub_rf_tbl1
>
> Actually, the \d (without "+") will also display filters but I don't
> think that has been tested anywhere. So suggest updating the comment
> and adding one more test
>
> AFTER
> -- test \d+ <tablename> and \d <tablename> (now these display filter
> information)
> ...
> \d+ testpub_rf_tbl1
> \d testpub_rf_tbl1

Changed.

> 6. src/test/regress/sql/publication.sql - tests for partitioned table
>
> +-- Tests for partitioned table
> +ALTER PUBLICATION testpub6 SET TABLE rf_tbl_abcd_part_pk WHERE (a >
> 99);
> +ALTER PUBLICATION testpub6 SET (PUBLISH_VIA_PARTITION_ROOT=0);
> +-- ok - partition does not have row filter
> +UPDATE rf_tbl_abcd_part_pk SET a = 1;
> +ALTER PUBLICATION testpub6 SET (PUBLISH_VIA_PARTITION_ROOT=1);
> +-- ok - "a" is a OK col
> +UPDATE rf_tbl_abcd_part_pk SET a = 1;
> +ALTER PUBLICATION testpub6 SET TABLE rf_tbl_abcd_part_pk WHERE (b >
> 99);
> +ALTER PUBLICATION testpub6 SET (PUBLISH_VIA_PARTITION_ROOT=0);
> +-- ok - partition does not have row filter
> +UPDATE rf_tbl_abcd_part_pk SET a = 1;
> +ALTER PUBLICATION testpub6 SET (PUBLISH_VIA_PARTITION_ROOT=1);
> +-- fail - "b" is not in REPLICA IDENTITY INDEX
> +UPDATE rf_tbl_abcd_part_pk SET a = 1;
>
> Those comments and the way the code is arranged did not make it very
> clear to me what exactly these tests are doing.
>
> I think all the changes to the publish_via_partition_root belong BELOW
> those test comments don't they?
> Also the same comment "-- ok - partition does not have row filter"
> appears 2 times so that can be made more clear too.
>
> e.g. IIUC it should be changed to something a bit like this (Note - I
> did not change the SQL, I only moved it a bit and changed the
> comments):
>

I think it might be better to put "-- ok" and "-- fail" before the DML as we
are testing the RI invalidation of DML here. But I added some comments
here to make it clearer.

Attach the V80 patch which addressed the above comments and
comments from Amit[1].

I also adjusted some code comments in the patch and fix the following
problems about inherited table:

- When subscriber is doing initial copy with row filter it will use "COPY
(SELECT ..) TO ..". If the target table is inherited parent table, SELECT
command will copy data from both the parent and child while we only need to
copy the parent table's data. So, Added a "ONLY" in this case to fix it.
- We didn't check the duplicate whereclause when speicifing both inherited
parent and child table with row filter in CRAETE PUBLICATION/ALTER
PUBLICATION. When adding a parent table we will also add all its child to the
list, so we need to check here if user already speicify the child with row
filter and report an error if yes.

Besides, added support for node RowExpr in row filter and added some testcases.

[1] https://www.postgresql.org/message-id/CAA4eK1JkXwu-dvOqEojnKUEZr2dXTLwz_QkQ5uJbmjiHs%3Dg0KQ%40mail.gmail.com

Best regards,
Hou zj

Attachment Content-Type Size
v80-0001-Allow-specifying-row-filters-for-logical-replication.patch application/octet-stream 169.8 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Bharath Rupireddy 2022-02-10 04:21:29 Re: Move replication slot structures/enums/macros to a new header file for better usability by external tools/modules
Previous Message Fujii Masao 2022-02-10 03:37:10 Re: [PATCH] Add min() and max() aggregate functions for xid8