Re: row filtering for logical replication

From: Peter Smith <smithpb2250(at)gmail(dot)com>
To: Ajin Cherian <itsajin(at)gmail(dot)com>
Cc: "houzj(dot)fnst(at)fujitsu(dot)com" <houzj(dot)fnst(at)fujitsu(dot)com>, Greg Nancarrow <gregn4422(at)gmail(dot)com>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>, Euler Taveira <euler(at)eulerto(dot)com>, Rahila Syed <rahilasyed90(at)gmail(dot)com>, Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com>, Önder Kalacı <onderkalaci(at)gmail(dot)com>, japin <japinli(at)hotmail(dot)com>, Michael Paquier <michael(at)paquier(dot)xyz>, David Steele <david(at)pgmasters(dot)net>, Craig Ringer <craig(at)2ndquadrant(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Amit Langote <amitlangote09(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: row filtering for logical replication
Date: 2021-11-18 05:38:27
Message-ID: CAHut+PtCr6kgW6Z2JpX6FJFbxA+o9Lm6B4Yqja_xMWE1ie5kdw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, Nov 15, 2021 at 12:01 PM Peter Smith <smithpb2250(at)gmail(dot)com> wrote:
>
> On Fri, Nov 12, 2021 at 9:19 PM Ajin Cherian <itsajin(at)gmail(dot)com> wrote:
> >
> > Attaching version 39-
>
> Here are some review comments for v39-0006:
>
> 1)
> @@ -261,9 +261,9 @@ rowfilter_expr_replident_walker(Node *node,
> rf_context *context)
> * Rule 1. Walk the parse-tree and reject anything other than very simple
> * expressions (See rowfilter_validator for details on what is permitted).
> *
> - * Rule 2. If the publish operation contains "delete" then only columns that
> - * are allowed by the REPLICA IDENTITY rules are permitted to be used in the
> - * row-filter WHERE clause.
> + * Rule 2. If the publish operation contains "delete" or "delete" then only
> + * columns that are allowed by the REPLICA IDENTITY rules are permitted to
> + * be used in the row-filter WHERE clause.
> */
> static void
> rowfilter_expr_checker(Publication *pub, ParseState *pstate, Node
> *rfnode, Relation rel)
> @@ -276,12 +276,10 @@ rowfilter_expr_checker(Publication *pub,
> ParseState *pstate, Node *rfnode, Relat
> rowfilter_validator(relname, rfnode);
>
> /*
> - * Rule 2: For "delete", check that filter cols are also valid replica
> + * Rule 2: For "delete" and "update", check that filter cols are also
> valid replica
> * identity cols.
> - *
> - * TODO - check later for publish "update" case.
> */
> - if (pub->pubactions.pubdelete)
>
> 1a)
> Typo - the function comment: "delete" or "delete"; should say:
> "delete" or "update"
>
> 1b)
> I felt it would be better (for the comment in the function body) to
> write it as "or" instead of "and" because then it matches with the
> code "if ||" that follows this comment.
>
> ====
>
> 2)
> @@ -746,6 +780,92 @@ logicalrep_read_typ(StringInfo in, LogicalRepTyp *ltyp)
> }
>
> /*
> + * Write a tuple to the outputstream using cached slot, in the most
> efficient format possible.
> + */
> +static void
> +logicalrep_write_tuple_cached(StringInfo out, Relation rel,
> TupleTableSlot *slot, bool binary)
>
> The function logicalrep_write_tuple_cached seems to have almost all of
> its function body in common with logicalrep_write_tuple. Is there any
> good way to combine these functions to avoid ~80 lines mostly
> duplicated code?
>
> ====
>
> 3)
> + if (!old_matched && !new_matched)
> + return false;
> +
> + if (old_matched && new_matched)
> + *action = REORDER_BUFFER_CHANGE_UPDATE;
> + else if (old_matched && !new_matched)
> + *action = REORDER_BUFFER_CHANGE_DELETE;
> + else if (new_matched && !old_matched)
> + *action = REORDER_BUFFER_CHANGE_INSERT;
> +
> + return true;
>
> I felt it is slightly confusing to have inconsistent ordering of the
> old_matched and new_matched in those above conditions.
>
> I suggest to use the order like:
> * old-row (no match) new-row (no match)
> * old-row (no match) new row (match)
> * old-row (match) new-row (no match)
> * old-row (match) new row (match)
>
> And then be sure to keep consistent ordering in all places it is mentioned:
> * in the code
> * in the function header comment
> * in the commit comment
> * in docs?
>
> ====
>
> 4)
> +/*
> + * Change is checked against the row filter, if any.
> + *
> + * If it returns true, the change is replicated, otherwise, it is not.
> + */
> +static bool
> +pgoutput_row_filter_virtual(Relation relation, TupleTableSlot *slot,
> RelationSyncEntry *entry)
> +{
> + EState *estate;
> + ExprContext *ecxt;
> + bool result = true;
> + Oid relid = RelationGetRelid(relation);
> +
> + /* Bail out if there is no row filter */
> + if (!entry->exprstate)
> + return true;
> +
> + elog(DEBUG3, "table \"%s.%s\" has row filter",
> + get_namespace_name(get_rel_namespace(relid)),
> + get_rel_name(relid));
>
> It seems like that elog may consume unnecessary CPU most of the time.
> I think it might be better to remove the relid declaration and rewrite
> that elog as:
>
> if (message_level_is_interesting(DEBUG3))
> elog(DEBUG3, "table \"%s.%s\" has row filter",
> get_namespace_name(get_rel_namespace(entry->relid)),
> get_rel_name(entry->relid));
>
> ====
>
> 5)
> diff --git a/src/include/replication/reorderbuffer.h
> b/src/include/replication/reorderbuffer.h
> index 5b40ff7..aec0059 100644
> --- a/src/include/replication/reorderbuffer.h
> +++ b/src/include/replication/reorderbuffer.h
> @@ -51,7 +51,7 @@ typedef struct ReorderBufferTupleBuf
> * respectively. They're used by INSERT .. ON CONFLICT .. UPDATE. Users of
> * logical decoding don't have to care about these.
> */
> -enum ReorderBufferChangeType
> +typedef enum ReorderBufferChangeType
> {
> REORDER_BUFFER_CHANGE_INSERT,
> REORDER_BUFFER_CHANGE_UPDATE,
> @@ -65,7 +65,7 @@ enum ReorderBufferChangeType
> REORDER_BUFFER_CHANGE_INTERNAL_SPEC_CONFIRM,
> REORDER_BUFFER_CHANGE_INTERNAL_SPEC_ABORT,
> REORDER_BUFFER_CHANGE_TRUNCATE
> -};
> +} ReorderBufferChangeType;
>
> This new typedef can be added to src/tools/pgindent/typedefs.list.
>

All above are fixed by Ajin Cherian in V40-0006 [1].

-----
[1] https://www.postgresql.org/message-id/CAHut%2BPv-D4rQseRO_OzfEz2dQsTKEnKjBCET9Z-iJppyT1XNMQ%40mail.gmail.com

Kind Regards,
Peter Smith.
Fujitsu Australia

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Peter Smith 2021-11-18 05:46:30 Re: row filtering for logical replication
Previous Message Peter Smith 2021-11-18 05:35:48 Re: row filtering for logical replication