Re: row filtering for logical replication

From: vignesh C <vignesh21(at)gmail(dot)com>
To: Peter Smith <smithpb2250(at)gmail(dot)com>
Cc: Euler Taveira <euler(at)eulerto(dot)com>, "houzj(dot)fnst(at)fujitsu(dot)com" <houzj(dot)fnst(at)fujitsu(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Ajin Cherian <itsajin(at)gmail(dot)com>, Greg Nancarrow <gregn4422(at)gmail(dot)com>, "tanghy(dot)fnst(at)fujitsu(dot)com" <tanghy(dot)fnst(at)fujitsu(dot)com>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>, Rahila Syed <rahilasyed90(at)gmail(dot)com>, Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com>, Önder Kalacı <onderkalaci(at)gmail(dot)com>, japin <japinli(at)hotmail(dot)com>, Michael Paquier <michael(at)paquier(dot)xyz>, David Steele <david(at)pgmasters(dot)net>, Craig Ringer <craig(at)2ndquadrant(dot)com>, Amit Langote <amitlangote09(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: row filtering for logical replication
Date: 2021-12-21 10:03:07
Message-ID: CALDaNm1_JVg_hqoGex_FVca_HPF46n9oDDB9dsp1SrPuaVpp-w@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, Dec 21, 2021 at 2:29 PM Peter Smith <smithpb2250(at)gmail(dot)com> wrote:
>
> Here is the v51* patch set:
>
> Main changes from Euler's v50* are
> 1. Most of Euler's "fixes" patches are now merged back in
> 2. Patches are then merged per Amit's suggestion [Amit 20/12]
> 3. Some other review comments are addressed
>
> ~~
>
> Phase 1 - Merge the Euler fixes
> ===============================
>
> v51-0001 (main) <== v50-0001 (main 0001) + v50-0002 (fixes 0001)
> - OK, accepted and merged all the "fixes" back into the 0001
> - (fixed typos)
> - There is a slight disagreement with what the PG docs say about
> NULLs, but I will raise a separate comment on -hackers later
> (meanwhile, the current PG docs text is from the Euler patch)
>
> v51-0002 (validation) <== v50-0003 (validation 0002) + v50-0004 (fixes 0002)
> - OK, accepted and merges all these "fixes" back into the 0002
> - (fixed typo)
>
> v51-0003 (new/old) <== v50-0005 (new/old 0003)
> - REVERTED to the v49 version of this patch!
> - Please see Ajin's explanation why the v49 code was required [Ajin 21/12]
>
> v51-0004 (tab-complete + dump) <== v50-0006 (tab-complete + dump 0004)
> - No changes
>
> v51-0005 (for all tables) <== v50-0007 (for all tables 0005) +
> v50-0008 (fixes 0005)
> - OK, accepted and merged most of these "fixes" back into the 0005
> - (fixed typo/grammar)
> - Amit requested we not use Euler's new tablesync SQL just yet [Amit 21/12]
>
>
> Phase 2 - Merge main (Phase 1) patches per Amit suggestion
> ================================================
>
> v51-0001 (main) <== v51-0001 (main) + v51-0002 (validation) + v51-0005
> (for all tables)
> - (also combined all the commit comments)
>
> v51-0002 (new/old) <== v51-0003 (new/old)
>
> v51-0005 (tab-complete + dump) <== v51-0004
>
>
> Review comments (details)
> =========================
>
> v51-0001 (main)
> - Addressed review comments from Amit. [Amit 20/12] #1,#2,#3,#4
>
> v51-0002 (new/old tuple)
> - Includes a patch from Greg (provided internally)
>
> v51-0003 (tab, dump)
> - No changes
>
> ------
> [Amit 20/12] https://www.postgresql.org/message-id/CAA4eK1JJgfEPKYvQAcwGa5jjuiUiQRQZ0Pgo-HF0KFHh-jyNQQ%40mail.gmail.com
> [Ajin 21/12] https://www.postgresql.org/message-id/CAFPTHDbfpPNh3GLGjySS%3DAuRfbQPQFNvfiyG1GDQW2kz1yT7Og%40mail.gmail.com
> [Amit 21/12] https://www.postgresql.org/message-id/CAA4eK1KwoA5k8v9z9e4ZPN_X%3D1GAmQmsWyauFwZpKiSHqy6eZA%40mail.gmail.com

Few comments:
1) list_free(schemarelids) is called inside if and and outside if in
break case. Can we move it above continue so that it does not gets
called in the break case:
+ schemarelids =
GetAllSchemaPublicationRelations(pub->oid,
+
pub->pubviaroot ?
+
PUBLICATION_PART_ROOT
:
+

PUBLICATION_PART_LEAF);
+ if (list_member_oid(schemarelids, entry->relid))
+ {
+ if (pub->pubactions.pubinsert)
+ no_filter[idx_ins] = true;
+ if (pub->pubactions.pubupdate)
+ no_filter[idx_upd] = true;
+ if (pub->pubactions.pubdelete)
+ no_filter[idx_del] = true;
+
+ list_free(schemarelids);
+
+ /* Quick exit loop if all pubactions
have no row-filter. */
+ if (no_filter[idx_ins] &&
no_filter[idx_upd] && no_filter[idx_del])
+ break;
+
+ continue;
+ }
+ list_free(schemarelids);

2) create_edata_for_relation also is using similar logic, can it also
call this function to reduce duplicate code
+static EState *
+create_estate_for_relation(Relation rel)
+{
+ EState *estate;
+ RangeTblEntry *rte;
+
+ estate = CreateExecutorState();
+
+ rte = makeNode(RangeTblEntry);
+ rte->rtekind = RTE_RELATION;
+ rte->relid = RelationGetRelid(rel);
+ rte->relkind = rel->rd_rel->relkind;
+ rte->rellockmode = AccessShareLock;
+ ExecInitRangeTable(estate, list_make1(rte));
+
+ estate->es_output_cid = GetCurrentCommandId(false);
+
+ return estate;
+}

3) In one place select is in lower case, it can be changed to upper case
+ resetStringInfo(&cmd);
+ appendStringInfo(&cmd,
+ "SELECT DISTINCT
pg_get_expr(prqual, prrelid) "
+ " FROM pg_publication p "
+ " INNER JOIN
pg_publication_rel pr ON (p.oid = pr.prpubid) "
+ " WHERE pr.prrelid
= %u AND p.pubname IN ( %s ) "
+ " AND NOT (select
bool_or(puballtables) "
+ " FROM pg_publication "
+ " WHERE pubname
in ( %s )) "
+ " AND (SELECT count(1)=0 "
+ " FROM
pg_publication_namespace pn, pg_class c "
+ " WHERE c.oid =
%u AND c.relnamespace = pn.pnnspid)",
+ lrel->remoteid,
+ pub_names.data,
+ pub_names.data,
+ lrel->remoteid);

4) we could run pgindent once to fix indentation issues
+ /* Cache ExprState using CacheMemoryContext. */
+ Assert(CurrentMemoryContext = CacheMemoryContext);
+
+ /* Prepare expression for execution */
+ exprtype = exprType(rfnode);
+ expr = (Expr *) coerce_to_target_type(NULL, rfnode, exprtype,
BOOLOID, -1, COERCION_ASSIGNMENT, COERCE_IMPLICIT_CAST, -1);
+
+ if (expr == NULL)

5) Should we mention user should take care of deletion of row filter
records after table sync is done.
+ ALL TABLES</literal> or <literal>FOR ALL TABLES IN
SCHEMA</literal> publication,
+ then all other <literal>WHERE</literal> clauses (for the same
publish operation)
+ become redundant.
+ If the subscriber is a <productname>PostgreSQL</productname>
version before 15
+ then any row filtering is ignored during the initial data
synchronization phase.

6) Should this be an Assert, since this will be taken care earlier by
GetTransformedWhereClause->transformWhereClause->coerce_to_boolean:
+ expr = (Expr *) coerce_to_target_type(NULL, rfnode, exprtype,
BOOLOID, -1, COERCION_ASSIGNMENT, COERCE_IMPLICIT_CAST, -1);
+
+ if (expr == NULL)
+ ereport(ERROR,
+ (errcode(ERRCODE_CANNOT_COERCE),
+ errmsg("row filter returns type %s
that cannot be cast to the expected type %s",
+ format_type_be(exprtype),
+ format_type_be(BOOLOID)),
+ errhint("You will need to rewrite the
row filter.")));

7) This code is present in 3 places, can we make it a function or
macro and use it:
+ if (pub->pubactions.pubinsert)
+ no_filter[idx_ins] = true;
+ if (pub->pubactions.pubupdate)
+ no_filter[idx_upd] = true;
+ if (pub->pubactions.pubdelete)
+ no_filter[idx_del] = true;
+
+ /* Quick exit loop if all pubactions
have no row-filter. */
+ if (no_filter[idx_ins] &&
no_filter[idx_upd] && no_filter[idx_del])
+ break;
+
+ continue;

8) Can the below transformation be done just before the
values[Anum_pg_publication_rel_prqual - 1] is set to reduce one of the
checks:
+ if (pri->whereClause != NULL)
+ {
+ /* Set up a ParseState to parse with */
+ pstate = make_parsestate(NULL);
+
+ /*
+ * Get the transformed WHERE clause, of boolean type,
with necessary
+ * collation information.
+ */
+ whereclause = GetTransformedWhereClause(pstate, pri, true);
+ }

/* Form a tuple. */
memset(values, 0, sizeof(values));
@@ -328,6 +376,12 @@ publication_add_relation(Oid pubid,
PublicationRelInfo *targetrel,
values[Anum_pg_publication_rel_prrelid - 1] =
ObjectIdGetDatum(relid);

+ /* Add qualifications, if available */
+ if (whereclause)
+ values[Anum_pg_publication_rel_prqual - 1] =
CStringGetTextDatum(nodeToString(whereclause));
+ else
+ nulls[Anum_pg_publication_rel_prqual - 1] = true;
+

Like:
/* Add qualifications, if available */
if (pri->whereClause != NULL)
{
/* Set up a ParseState to parse with */
pstate = make_parsestate(NULL);

/*
* Get the transformed WHERE clause, of boolean type, with necessary
* collation information.
*/
whereclause = GetTransformedWhereClause(pstate, pri, true);

/*
* Walk the parse-tree of this publication row filter expression and
* throw an error if anything not permitted or unexpected is
* encountered.
*/
rowfilter_walker(whereclause, targetrel);
values[Anum_pg_publication_rel_prqual - 1] =
CStringGetTextDatum(nodeToString(whereclause));
}
else
nulls[Anum_pg_publication_rel_prqual - 1] = true;

Regards,
Vignesh

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Kyotaro Horiguchi 2021-12-21 11:04:55 Re: In-placre persistance change of a relation
Previous Message Greg Nancarrow 2021-12-21 08:59:34 Re: Failed transaction statistics to measure the logical replication progress