Re: row filtering for logical replication

From: Peter Smith <smithpb2250(at)gmail(dot)com>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: "houzj(dot)fnst(at)fujitsu(dot)com" <houzj(dot)fnst(at)fujitsu(dot)com>, "tanghy(dot)fnst(at)fujitsu(dot)com" <tanghy(dot)fnst(at)fujitsu(dot)com>, Euler Taveira <euler(at)eulerto(dot)com>, Greg Nancarrow <gregn4422(at)gmail(dot)com>, vignesh C <vignesh21(at)gmail(dot)com>, Ajin Cherian <itsajin(at)gmail(dot)com>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>, Rahila Syed <rahilasyed90(at)gmail(dot)com>, Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com>, Önder Kalacı <onderkalaci(at)gmail(dot)com>, japin <japinli(at)hotmail(dot)com>, Michael Paquier <michael(at)paquier(dot)xyz>, David Steele <david(at)pgmasters(dot)net>, Craig Ringer <craig(at)2ndquadrant(dot)com>, Amit Langote <amitlangote09(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: row filtering for logical replication
Date: 2022-01-06 22:30:13
Message-ID: CAHut+Pv1en=ZSFvhOfx4r2rAnHps7ELDYkbA0GKD4XTn8ht08w@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, Jan 5, 2022 at 4:34 PM Peter Smith <smithpb2250(at)gmail(dot)com> wrote:
>
> I have reviewed again the source code for v58-0001.
>
> Below are my review comments.
>
> Actually, I intend to fix most of these myself for v59*, so this post
> is just for records.
>
> v58-0001 Review Comments
> ========================
>
> 1. doc/src/sgml/ref/alter_publication.sgml - reword for consistency
>
> + name to explicitly indicate that descendant tables are included. If the
> + optional <literal>WHERE</literal> clause is specified, rows that do not
> + satisfy the <replaceable class="parameter">expression</replaceable> will
> + not be published. Note that parentheses are required around the
>
> For consistency, it would be better to reword this sentence about the
> expression to be more similar to the one in CREATE PUBLICATION, which
> now says:
>
> + If the optional <literal>WHERE</literal> clause is specified, rows for
> + which the <replaceable class="parameter">expression</replaceable> returns
> + false or null will not be published. Note that parentheses are required
> + around the expression. It has no effect on <literal>TRUNCATE</literal>
> + commands.
>

Updated in v59* [1]

> ~~
>
> 2. doc/src/sgml/ref/create_subscription.sgml - reword for consistency
>
> @@ -319,6 +324,25 @@ CREATE SUBSCRIPTION <replaceable
> class="parameter">subscription_name</replaceabl
> the parameter <literal>create_slot = false</literal>. This is an
> implementation restriction that might be lifted in a future release.
> </para>
> +
> + <para>
> + If any table in the publication has a <literal>WHERE</literal> clause, rows
> + that do not satisfy the <replaceable
> class="parameter">expression</replaceable>
> + will not be published. If the subscription has several publications in which
>
> For consistency, it would be better to reword this sentence about the
> expression to be more similar to the one in CREATE PUBLICATION, which
> now says:
>
> + If the optional <literal>WHERE</literal> clause is specified, rows for
> + which the <replaceable class="parameter">expression</replaceable> returns
> + false or null will not be published. Note that parentheses are required
> + around the expression. It has no effect on <literal>TRUNCATE</literal>
> + commands.
>

Updated in v59* [1]

> ~~
>
> 3. src/backend/catalog/pg_publication.c - whitespace
>
> +rowfilter_walker(Node *node, Relation relation)
> +{
> + char *errdetail_msg = NULL;
> +
> + if (node == NULL)
> + return false;
> +
> +
> + if (IsRowFilterSimpleExpr(node))
>
> Remove the extra blank line.
>

Fixed in v59* [1]

> ~~
>
> 4. src/backend/executor/execReplication.c - move code
>
> + bad_rfcolnum = GetRelationPublicationInfo(rel, true);
> +
> + /*
> + * It is only safe to execute UPDATE/DELETE when all columns referenced in
> + * the row filters from publications which the relation is in are valid,
> + * which means all referenced columns are part of REPLICA IDENTITY, or the
> + * table do not publish UPDATES or DELETES.
> + */
> + if (AttributeNumberIsValid(bad_rfcolnum))
>
> I felt that the bad_rfcolnum assignment belongs below the large
> comment explaining this logic.
>

Fixed in v59* [1]

> ~~
>
> 5. src/backend/executor/execReplication.c - fix typo
>
> + /*
> + * It is only safe to execute UPDATE/DELETE when all columns referenced in
> + * the row filters from publications which the relation is in are valid,
> + * which means all referenced columns are part of REPLICA IDENTITY, or the
> + * table do not publish UPDATES or DELETES.
> + */
>
> Typo: "table do not publish" -> "table does not publish"
>

Fixed in v59* [1]

> ~~
>
> 6. src/backend/replication/pgoutput/pgoutput.c - fix typo
>
> + oldctx = MemoryContextSwitchTo(CacheMemoryContext);
> + /* Gather the rfnodes per pubaction of this publiaction. */
> + if (pub->pubactions.pubinsert)
>
> Typo: "publiaction" --> "publication"
>

Fixed in v59* [1]

> ~~
>
> 7. src/backend/utils/cache/relcache.c - fix comment case
>
> @@ -267,6 +271,19 @@ typedef struct opclasscacheent
>
> static HTAB *OpClassCache = NULL;
>
> +/*
> + * Information used to validate the columns in the row filter expression. see
> + * rowfilter_column_walker for details.
> + */
>
> Typo: "see" --> "See"
>

Fixed in v59* [1]

> ~~
>
> 8. src/backend/utils/cache/relcache.c - "row-filter"
>
> For consistency with all other naming change all instances of
> "row-filter" to "row filter" in this file.
>

Fixed in v59* [1]

> ~~
>
> 9. src/backend/utils/cache/relcache.c - fix typo
>

Fixed in v59* [1]

> ~~
>
> 10. src/backend/utils/cache/relcache.c - comment confused wording?
>
> Function GetRelationPublicationInfo:
>
> + /*
> + * For a partition, if pubviaroot is true, check if any of the
> + * ancestors are published. If so, note down the topmost ancestor
> + * that is published via this publication, the row filter
> + * expression on which will be used to filter the partition's
> + * changes. We could have got the topmost ancestor when collecting
> + * the publication oids, but that will make the code more
> + * complicated.
> + */
>
> Typo: Probably "on which' --> "of which" ?
>

Fixed in v59* [1]

> ~~
>
> 11. src/backend/utils/cache/relcache.c - GetRelationPublicationActions
>
> Something seemed slightly fishy with the code doing the memcpy,
> because IIUC is possible for the GetRelationPublicationInfo function
> to return without setting the relation->rd_pubactions. Is it just
> missing an Assert or maybe a comment to say such a scenario is not
> possible in this case because the is_publishable_relation was already
> tested?
>
> Currently, it just seems a little bit too sneaky.
>

TODO

> ~~
>
> 12. src/include/parser/parse_node.h - This change is unrelated to row-filtering.
>
> @@ -79,7 +79,7 @@ typedef enum ParseExprKind
> EXPR_KIND_CALL_ARGUMENT, /* procedure argument in CALL */
> EXPR_KIND_COPY_WHERE, /* WHERE condition in COPY FROM */
> EXPR_KIND_GENERATED_COLUMN, /* generation expression for a column */
> - EXPR_KIND_CYCLE_MARK, /* cycle mark value */
> + EXPR_KIND_CYCLE_MARK /* cycle mark value */
> } ParseExprKind;
>
> This change is unrelated to Row-Filtering so ought to be removed from
> this patch. Soon I will post a separate thread to fix this
> independently on HEAD.
>

Fixed in v59* [1].

I started a separate thread for this problem.
See https://www.postgresql.org/message-id/flat/CAHut%2BPsqr93nng7diTXxtUD636u7ytA%3DMq2duRphs0CBzpfDTA%40mail.gmail.com

> ~~
>
> 13. src/include/utils/rel.h - comment typos
>
> @@ -164,6 +164,13 @@ typedef struct RelationData
> PublicationActions *rd_pubactions; /* publication actions */
>
> /*
> + * true if the columns referenced in row filters from all the publications
> + * the relation is in are part of replica identity, or the publication
> + * actions do not include UPDATE and DELETE.
> + */
>
> Some minor rewording of the comment:
>
> "true" --> "True".
> "part of replica identity" --> "part of the replica identity"
> "UPDATE and DELETE" --> "UPDATE or DELETE"
>

Fixed in v59* [1]

------
[1] https://www.postgresql.org/message-id/CAHut%2BPsiw9fbOUTpCMWirut1ZD5hbWk8_U9tZya4mG-YK%2Bfq8g%40mail.gmail.com

Kind Regards,
Peter Smith.
Fujitsu Australia

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Peter Smith 2022-01-06 22:39:14 Re: row filtering for logical replication
Previous Message Peter Smith 2022-01-06 22:23:53 Re: row filtering for logical replication