RE: row filtering for logical replication

From: "houzj(dot)fnst(at)fujitsu(dot)com" <houzj(dot)fnst(at)fujitsu(dot)com>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: Peter Smith <smithpb2250(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, Euler Taveira <euler(at)eulerto(dot)com>, Greg Nancarrow <gregn4422(at)gmail(dot)com>, vignesh C <vignesh21(at)gmail(dot)com>, Ajin Cherian <itsajin(at)gmail(dot)com>, "tanghy(dot)fnst(at)fujitsu(dot)com" <tanghy(dot)fnst(at)fujitsu(dot)com>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>, Rahila Syed <rahilasyed90(at)gmail(dot)com>, Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com>, Önder Kalacı <onderkalaci(at)gmail(dot)com>, japin <japinli(at)hotmail(dot)com>, Michael Paquier <michael(at)paquier(dot)xyz>, David Steele <david(at)pgmasters(dot)net>, Craig Ringer <craig(at)2ndquadrant(dot)com>, Amit Langote <amitlangote09(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: RE: row filtering for logical replication
Date: 2022-02-11 09:02:05
Message-ID: OS0PR01MB5716C373F6169B9570394EAA94309@OS0PR01MB5716.jpnprd01.prod.outlook.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thursday, February 10, 2022 6:18 PM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
>
> On Thu, Feb 10, 2022 at 9:29 AM houzj(dot)fnst(at)fujitsu(dot)com
> <houzj(dot)fnst(at)fujitsu(dot)com> wrote:
> >
> >
> > Attach the V80 patch which addressed the above comments and
> > comments from Amit[1].
> >
>
> Thanks for the new version. Few minor/cosmetic comments:

Thanks for the comments !

> 1. Can we slightly change the below comment:
> Before:
> + * To avoid fetching the publication information, we cache the publication
> + * actions and row filter validation information.
>
> After:
> To avoid fetching the publication information repeatedly, we cache the
> publication actions and row filter validation information.

Changed.

> 2.
> + /*
> + * For ordinary tables, make sure we don't copy data from child
> + * that inherits the named table.
> + */
> + if (lrel.relkind == RELKIND_RELATION)
> + appendStringInfoString(&cmd, " ONLY ");
>
> I think we should mention the reason why we are doing so. So how about
> something like: "For regular tables, make sure we don't copy data from
> a child that inherits the named table as those will be copied
> separately."

Changed.

> 3.
> Can we change the below comment?
>
> Before:
> + /*
> + * Initialize the tuple slot, map and row filter that are only used
> + * when publishing inserts, updates or deletes.
> + */
>
> After:
> Initialize the tuple slot, map, and row filter. These are only used
> when publishing inserts, updates, or deletes.

Changed.

> 4.
> +CREATE TABLE testpub_rf_tbl1 (a integer, b text);
> +CREATE TABLE testpub_rf_tbl2 (c text, d integer);
>
> Here, you can add a comment saying: "-- Test row filters" or something
> on those lines.

Changed.

> 5.
> +-- test \d+ (now it displays filter information)
> +SET client_min_messages = 'ERROR';
> +CREATE PUBLICATION testpub_dplus_rf_yes FOR TABLE testpub_rf_tbl1
> WHERE (a > 1) WITH (publish = 'insert');
> +CREATE PUBLICATION testpub_dplus_rf_no FOR TABLE testpub_rf_tbl1;
> +RESET client_min_messages;
> +\d+ testpub_rf_tbl1
> + Table "public.testpub_rf_tbl1"
> + Column | Type | Collation | Nullable | Default | Storage | Stats
> target | Description
> +--------+---------+-----------+----------+---------+----------+------------
> --+-------------
> + a | integer | | | | plain | |
> + b | text | | | | extended | |
> +Publications:
> + "testpub_dplus_rf_no"
> + "testpub_dplus_rf_yes" WHERE (a > 1)
>
> I think here \d is sufficient to show row filters? I think it is
> better to use table names such as testpub_rf_yes or testpub_rf_no in
> this test.

Changed.

> 6.
> +# Similarly, the table filter for tab_rf_x (after the initial phase) has no
> +# effect when combined with the ALL TABLES IN SCHEMA.
> +# Expected: 5 initial rows + 2 new rows = 7 rows
> +$node_publisher->safe_psql('postgres', "INSERT INTO tab_rf_x (x)
> VALUES (-99), (99)");
> +$node_publisher->wait_for_catchup($appname);
> +$result = $node_subscriber->safe_psql('postgres', "SELECT count(x)
> FROM tab_rf_x");
> +is($result, qq(7), 'check table tab_rf_x should not be filtered');
>
> I think the comment here should say "ALL TABLES." instead of "ALL
> TABLES IN SCHEMA." as there is no publication before this test which
> is created with "ALL TABLES IN SCHEMA" clause.

Changed.

> 7.
> +# The subscription of the ALL TABLES IN SCHEMA publication means
> there should be
> +# no filtering on the tablesync COPY, so all expect all 5 will be present.
>
> It doesn't make sense to use 'all' twice in the above comment, the
> first one can be removed.

Changed.

> 8.
> +
> +# setup structure on publisher
> +$node_publisher->safe_psql('postgres',
>
> I think it will be good if we can add some generic comments explaining
> the purpose of the tests following this. We can add "# Tests FOR TABLE
> with row filter publications" before the current comment.

Added.

> 9. For the newly added test for tab_rowfilter_inherited, the patch has
> a test case only for initial sync, can we add a test for replication
> after initial sync for the same?

Added.

Attach the v81 patch which addressed above comments.

Best regards,
Hou zj

Attachment Content-Type Size
v81-0001-Allow-specifying-row-filters-for-logical-replication.patch application/octet-stream 170.1 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Peter Eisentraut 2022-02-11 09:12:55 Re: unlogged sequences
Previous Message Simon Riggs 2022-02-11 08:48:43 Re: Race condition in TransactionIdIsInProgress