RE: row filtering for logical replication

From: "houzj(dot)fnst(at)fujitsu(dot)com" <houzj(dot)fnst(at)fujitsu(dot)com>
To: Peter Smith <smithpb2250(at)gmail(dot)com>
Cc: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, Euler Taveira <euler(at)eulerto(dot)com>, Greg Nancarrow <gregn4422(at)gmail(dot)com>, vignesh C <vignesh21(at)gmail(dot)com>, Ajin Cherian <itsajin(at)gmail(dot)com>, "tanghy(dot)fnst(at)fujitsu(dot)com" <tanghy(dot)fnst(at)fujitsu(dot)com>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>, Rahila Syed <rahilasyed90(at)gmail(dot)com>, Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com>, Önder Kalacı <onderkalaci(at)gmail(dot)com>, japin <japinli(at)hotmail(dot)com>, Michael Paquier <michael(at)paquier(dot)xyz>, David Steele <david(at)pgmasters(dot)net>, Craig Ringer <craig(at)2ndquadrant(dot)com>, Amit Langote <amitlangote09(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: RE: row filtering for logical replication
Date: 2022-02-08 02:31:02
Message-ID: OS0PR01MB571648124C5BAE8B795CEDD9942D9@OS0PR01MB5716.jpnprd01.prod.outlook.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Monday, February 7, 2022 3:51 PM Peter Smith <smithpb2250(at)gmail(dot)com> wrote:
>
> Hi - I did a review of the v77 patches merged with Amit's v77 diff patch [1].
>
> (Maybe this is equivalent to reviewing v78)
>
> Below are my review comments:

Thanks for the comments!

> ======
>
> 1. doc/src/sgml/ref/create_publication.sgml - CREATE PUBLICATION
>
> + The <literal>WHERE</literal> clause allows simple expressions that
> don't have
> + user-defined functions, operators, collations, non-immutable built-in
> + functions, or references to system columns.
> + </para>
>
> That seems slightly ambiguous for operators and collations. It's only
> the USER-DEFINED ones we don't support.
>
> Perhaps it should be worded like:
>
> "allows simple expressions that don't have user-defined functions,
> user-defined operators, user-defined collations, non-immutable
> built-in functions..."

Changed.

>
> 2. src/backend/catalog/pg_publication.c - GetTopMostAncestorInPublication
>
> +Oid
> +GetTopMostAncestorInPublication(Oid puboid, List *ancestors)
> +{
> + ListCell *lc;
> + Oid topmost_relid = InvalidOid;
> +
> + /*
> + * Find the "topmost" ancestor that is in this publication.
> + */
> + foreach(lc, ancestors)
> + {
> + Oid ancestor = lfirst_oid(lc);
> + List *apubids = GetRelationPublications(ancestor);
> + List *aschemaPubids = NIL;
> +
> + if (list_member_oid(apubids, puboid))
> + topmost_relid = ancestor;
> + else
> + {
> + aschemaPubids = GetSchemaPublications(get_rel_namespace(ancestor));
> + if (list_member_oid(aschemaPubids, puboid))
> + topmost_relid = ancestor;
> + }
> +
> + list_free(apubids);
> + list_free(aschemaPubids);
> + }
> +
> + return topmost_relid;
> +}
>
> Wouldn't it be better for the aschemaPubids to be declared and freed
> inside the else block?

I personally think the current code is clean and the code was borrowed from
Greg's comment[1]. So, I didn't change this.

[1] https://www.postgresql.org/message-id/CAJcOf-c2%2BWbjeP7NhwgcAEtsn9KdDnhrsowheafbZ9%2BQU9C8SQ%40mail.gmail.com

>
> 3. src/backend/commands/publicationcmds.c - contain_invalid_rfcolumn
>
> + if (pubviaroot && relation->rd_rel->relispartition)
> + {
> + publish_as_relid = GetTopMostAncestorInPublication(pubid, ancestors);
> +
> + if (publish_as_relid == InvalidOid)
> + publish_as_relid = relid;
> + }
>
> Consider using the macro code for the InvalidOid check. e.g.
>
> if (!OidIsValid(publish_as_relid)
> publish_as_relid = relid;
>

Changed.

>
> 4. src/backend/commands/publicationcmds.c - IsRowFilterSimpleExpr (Tests)
>
> + switch (nodeTag(node))
> + {
> + case T_ArrayExpr:
> + case T_BooleanTest:
> + case T_BoolExpr:
> + case T_CaseExpr:
> + case T_CaseTestExpr:
> + case T_CoalesceExpr:
> + case T_CollateExpr:
> + case T_Const:
> + case T_FuncExpr:
> + case T_MinMaxExpr:
> + case T_NullTest:
> + case T_RelabelType:
> + case T_XmlExpr:
> + return true;
> + default:
> + return false;
> + }
>
> I think there are several missing regression tests.
>
> 4a. There is a new message that says "User-defined collations are not
> allowed." but I never saw any test case for it.
>
> 4b. There is also the RelabelType which seems to have no test case.
> Amit previously provided [2] some SQL which would give an unexpected
> error, so I guess that should be a new regression test case. e.g.
> create table t1(c1 int, c2 varchar(100));
> create publication pub1 for table t1 where (c2 < 'john');

I added some tests to cover these nodes.

>
> 5. src/backend/commands/publicationcmds.c - IsRowFilterSimpleExpr
> (Simple?)
>
> +/*
> + * Is this a simple Node permitted within a row filter expression?
> + */
> +static bool
> +IsRowFilterSimpleExpr(Node *node)
> +{
>
> A lot has changed in this area recently and I feel that there is
> something not quite 100% right with the naming and/or logic in this
> expression validation. IMO there are several functions that seem to
> depend too much on each other in special ways...
>
> IIUC the "walker" logic now seems to be something like this:
> a) Check for special cases of the supported nodes
> b) Then check for supported (simple?) nodes (i.e.
> IsRowFilterSimpleExpr is now acting as a "catch-all" after the special
> case checks)
> c) Then check for some unsupported node embedded within a supported
> node (i.e. call expr_allowed_in_node)
> d) If any of a,b,c was bad then give an error.
>
> To achieve that logic the T_FuncExpr was added to the
> "IsRowFilterSimpleExpr". Meanwhile, other nodes like
> T_ScalarArrayOpExpr and T_NullIfExpr now are removed from
> IsRowFilterSimpleExpr - I don't quite know why these got removed but
> perhaps there is implicit knowledge that those node kinds were already
> checked by the "walker" before the IsRowFilterSimpleExpr function ever
> gets called.
>
> So, although I trust that everything is working OK, I don't think
> IsRowFilterSimpleExpr is really just about simple nodes anymore. It is
> harder to see why some supported nodes are in there, and some
> supported nodes are not. It seems tightly entwined with the logic of
> check_simple_rowfilter_expr_walker; i.e. there seem to be assumptions
> about exactly when it will be called and what was checked before and
> what will be checked after calling it.
>
> IMO probably all the nodes we are supporting should be in the
> IsRowFilterSimpleExpr just for completeness (e.g. put T_NullIfExpr and
> T_ScalarArrayOpExpr back in there...), and maybe the function should
> be renamed (IsRowFilterAllowedNode?), and probably there need to be
> more comments describing the validation logic (e.g. the a/b/c/d logic
> I mentioned above).

I adjusted these codes by moving all the move back all the nodes handled in
IsRowFilterSimpleExpr back to check_simple_rowfilter_expr_walker() and change
the handling to switch..case.

>
> 6. src/backend/commands/publicationcmds.c - IsRowFilterSimpleExpr (T_List)
>
> (From Amit's patch)
>
> @@ -395,6 +397,7 @@ IsRowFilterSimpleExpr(Node *node)
> case T_NullTest:
> case T_RelabelType:
> case T_XmlExpr:
> + case T_List:
> return true;
> default:
> return false;
>
>
> The case T_List should be moved to be alphabetical the same as all the
> other cases.

I reordered these referring to the order as they are defined in nodes.h.

>
> 7. src/backend/commands/publicationcmds.c -
> contain_mutable_or_ud_functions_checker
>
> +/* check_functions_in_node callback */
> +static bool
> +contain_mutable_or_ud_functions_checker(Oid func_id, void *context)
>
> "ud" seems a strange name. Maybe better to name this function
> "contain_mutable_or_user_functions_checker" ?
>

Changed.

>
> 8. src/backend/commands/publicationcmds.c - expr_allowed_in_node
> (comment)
>
> (From Amit's patch)
>
> @@ -410,6 +413,37 @@ contain_mutable_or_ud_functions_checker(Oid
> func_id, void *context)
> }
>
> /*
> + * Check, if the node contains any unallowed object in node. See
> + * check_simple_rowfilter_expr_walker.
> + *
> + * Returns the error detail meesage in errdetail_msg for unallowed
> expressions.
> + */
> +static bool
> +expr_allowed_in_node(Node *node, ParseState *pstate, char
> **errdetail_msg)
>
> Remove the comma: "Check, if ..." --> "Check if ..."
> Typo: "meesage" --> "message"
>

Changed.

>
> 9. src/backend/commands/publicationcmds.c - expr_allowed_in_node (else)
>
> (From Amit's patch)
>
> + if (exprType(node) >= FirstNormalObjectId)
> + *errdetail_msg = _("User-defined types are not allowed.");
> + if (check_functions_in_node(node,
> contain_mutable_or_ud_functions_checker,
> + (void*) pstate))
> + *errdetail_msg = _("User-defined or built-in mutable functions are
> not allowed.");
> + else if (exprCollation(node) >= FirstNormalObjectId)
> + *errdetail_msg = _("User-defined collations are not allowed.");
> + else if (exprInputCollation(node) >= FirstNormalObjectId)
> + *errdetail_msg = _("User-defined collations are not allowed.");
>
> Is that correct - isn't there a missing "else" on the 2nd "if"?
>

Changed.

>
> 10. src/backend/commands/publicationcmds.c - expr_allowed_in_node (bool)
>
> (From Amit's patch)
>
> +static bool
> +expr_allowed_in_node(Node *node, ParseState *pstate, char
> **errdetail_msg)
>
> Why is this a boolean function? It can never return false (??)
>

Changed.

>
> 11. src/backend/commands/publicationcmds.c -
> check_simple_rowfilter_expr_walker (else)
>
> (From Amit's patch)
>
> @@ -500,12 +519,18 @@ check_simple_rowfilter_expr_walker(Node *node,
> ParseState *pstate)
> }
> }
> }
> - else if (!IsRowFilterSimpleExpr(node))
> + else if (IsRowFilterSimpleExpr(node))
> + {
> + }
> + else
> {
> elog(DEBUG3, "row filter contains an unexpected expression
> component: %s", nodeToString(node));
> errdetail_msg = _("Expressions only allow columns, constants,
> built-in operators, built-in data types, built-in collations and
> immutable built-in functions.");
> }
>
> Why introduce a new code block that does nothing?
>

Changed it to switch ... case which don’t have this problem.

>
> 12. src/backend/replication/pgoutput/pgoutput.c - get_rel_sync_entry
>
> + /*
> + * Initialize the row filter after getting the final publish_as_relid
> + * as we only evaluate the row filter of the relation which we publish
> + * change as.
> + */
> + pgoutput_row_filter_init(data, active_publications, entry);
>
> The comment "which we publish change as" seems strangely worded.
>
> Perhaps it should be:
> "... only evaluate the row filter of the relation which being published."

Changed.

>
> 13. src/backend/utils/cache/relcache.c - RelationBuildPublicationDesc
> (release)
>
> + /*
> + * Check if all columns referenced in the filter expression are part of
> + * the REPLICA IDENTITY index or not.
> + *
> + * If the publication is FOR ALL TABLES then it means the table has no
> + * row filters and we can skip the validation.
> + */
> + if (!pubform->puballtables &&
> + (pubform->pubupdate || pubform->pubdelete) &&
> + contain_invalid_rfcolumn(pubid, relation, ancestors,
> + pubform->pubviaroot))
> + {
> + if (pubform->pubupdate)
> + pubdesc->rf_valid_for_update = false;
> + if (pubform->pubdelete)
> + pubdesc->rf_valid_for_delete = false;
> + }
>
> ReleaseSysCache(tup);
>
> This change has the effect of moving the location of the
> "ReleaseSysCache(tup);" to much lower in the code but I think there is
> no point to move it for the Row Filter patch, so it should be left
> where it was before.

The newly added code here refers to the 'pubform' which comes from the ' tup',
So I think we should release the tuple after these codes.

>
> 14. src/backend/utils/cache/relcache.c - RelationBuildPublicationDesc
> (if refactor)
>
> - if (pubactions->pubinsert && pubactions->pubupdate &&
> - pubactions->pubdelete && pubactions->pubtruncate)
> + if (pubdesc->pubactions.pubinsert && pubdesc->pubactions.pubupdate
> &&
> + pubdesc->pubactions.pubdelete && pubdesc->pubactions.pubtruncate
> &&
> + !pubdesc->rf_valid_for_update && !pubdesc->rf_valid_for_delete)
> break;
>
> I felt that the "rf_valid_for_update" and "rf_valid_for_delete" should
> be checked first in that if condition. It is probably more optimal to
> move them because then it can bail out early. All those other
> pubaction flags are more likely to be true most of the time (because
> that is the default case).

I don't have a strong opinion on this, I feel it's fine to put the newly added
check at the end as it doesn't bring notable performance impact.

>
> 15. src/bin/psql/describe.c - SQL format
>
> @@ -2898,12 +2902,12 @@ describeOneTableDetails(const char
> *schemaname,
> else
> {
> printfPQExpBuffer(&buf,
> - "SELECT pubname\n"
> + "SELECT pubname, NULL\n"
> "FROM pg_catalog.pg_publication p\n"
> "JOIN pg_catalog.pg_publication_rel pr ON p.oid = pr.prpubid\n"
> "WHERE pr.prrelid = '%s'\n"
> "UNION ALL\n"
> - "SELECT pubname\n"
> + "SELECT pubname, NULL\n"
> "FROM pg_catalog.pg_publication p\n"
>
> I thought it may be better to reformat to put the NULL columns on a
> different line for consistent format with the other SQL just above
> this one. e.g.
>
> printfPQExpBuffer(&buf,
> "SELECT pubname\n"
> + " , NULL\n"

Changed.

Attach the V79 patch set which addressed the above comments and adjust some
comments related to expression check.

Best regards,
Hou zj

Attachment Content-Type Size
v79-0001-Allow-specifying-row-filters-for-logical-replication.patch application/octet-stream 164.2 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2022-02-08 02:40:53 Re: [RFC] building postgres with meson - perl embedding
Previous Message Julien Rouhaud 2022-02-08 02:10:50 Add tag/category to the commitfest app