Re: row filtering for logical replication

From: "Euler Taveira" <euler(at)eulerto(dot)com>
To: "Peter Smith" <smithpb2250(at)gmail(dot)com>
Cc: "Amit Kapila" <amit(dot)kapila16(at)gmail(dot)com>, "Rahila Syed" <rahilasyed90(at)gmail(dot)com>, "Peter Eisentraut" <peter(dot)eisentraut(at)enterprisedb(dot)com>, Önder Kalacı <onderkalaci(at)gmail(dot)com>, japin <japinli(at)hotmail(dot)com>, "Michael Paquier" <michael(at)paquier(dot)xyz>, "David Steele" <david(at)pgmasters(dot)net>, "Craig Ringer" <craig(at)2ndquadrant(dot)com>, "Tomas Vondra" <tomas(dot)vondra(at)2ndquadrant(dot)com>, "Amit Langote" <amitlangote09(at)gmail(dot)com>, "PostgreSQL Hackers" <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: row filtering for logical replication
Date: 2021-07-13 20:37:55
Message-ID: c56e001e-313a-4056-a289-4ea33da92322@www.fastmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, Jul 13, 2021, at 12:25 AM, Peter Smith wrote:
> I have reviewed the latest v18 patch. Below are some more review
> comments and patches.
Peter, thanks for quickly check the new patch. I'm attaching a new patch (v19)
that addresses (a) this new review, (b) Tomas' review and (c) Greg's review. I
also included the copy/equal node support for the new node (PublicationTable)
mentioned by Tomas in another email.

> 1. Commit comment - wording
8<

> =>
>
> I think this means to say: "Rows that don't satisfy an optional WHERE
> clause will be filtered out."
Agreed.

> 2. Commit comment - wording
>
> "The row filter is per table, which allows different row filters to be
> defined for different tables."
>
> =>
>
> I think all that is the same as just saying: "The row filter is per table."
Agreed.

> 3. PG docs - independent improvement
>
> You wrote (ref [1] point 3):
>
> "I agree it can be confusing. BTW, CREATE PUBLICATION does not mention that the
> root partitioned table is used. We should improve that sentence too."
>
> I agree, but that PG docs improvement is independent of your RowFilter
> patch; please make another thread for that idea.
I will. And I will also include the next item that I removed from the patch.

> 4. doc/src/sgml/ref/create_publication.sgml - independent improvement
>
> @@ -131,9 +135,9 @@ CREATE PUBLICATION <replaceable
> class="parameter">name</replaceable>
> on its partitions) contained in the publication will be published
> using the identity and schema of the partitioned table rather than
> that of the individual partitions that are actually changed; the
> - latter is the default. Enabling this allows the changes to be
> - replicated into a non-partitioned table or a partitioned table
> - consisting of a different set of partitions.
> + latter is the default (<literal>false</literal>). Enabling this
> + allows the changes to be replicated into a non-partitioned table or a
> + partitioned table consisting of a different set of partitions.
> </para>
>
> I think that Tomas wrote (ref [2] point 2) that this change seems
> unrelated to your RowFilter patch.
>
> I agree; I liked the change, but IMO you need to propose this one in
> another thread too.
Reverted.

> 5. doc/src/sgml/ref/create_subscription.sgml - wording
8<

> I felt that the sentence: "If any table in the publications has a
> <literal>WHERE</literal> clause, data synchronization does not use it
> if the subscriber is a <productname>PostgreSQL</productname> version
> before 15."
>
> Could be expressed more simply like: "If the subscriber is a
> <productname>PostgreSQL</productname> version before 15 then any row
> filtering is ignored."
Agreed.

> 6. src/backend/commands/publicationcmds.c - wrong function comment
8<

> /*
> * Close all relations in the list.
> + *
> + * Publication node can have a different list element, hence, pub_drop_table
> + * indicates if it has a Relation (true) or PublicationTable (false).
> */
> static void
> CloseTableList(List *rels)
>
> =>
>
> The 2nd parameter does not exist in v18, so that comment about
> pub_drop_table seems to be a cut/paste error from the OpenTableList.
Oops. Removed.

> src/backend/replication/logical/tablesync.c - bug ?
>
> @@ -829,16 +883,23 @@ copy_table(Relation rel)
> relmapentry = logicalrep_rel_open(lrel.remoteid, NoLock);
> Assert(rel == relmapentry->localrel);
>
> + /* List of columns for COPY */
> + attnamelist = make_copy_attnamelist(relmapentry);
> +
> /* Start copy on the publisher. */
> =>
>
> I did not understand the above call to make_copy_attnamelist. The
> result seems unused before it is overwritten later in this same
> function (??)
Good catch. This seems to be a leftover from an ancient version.

> 7. src/backend/replication/logical/tablesync.c -
> fetch_remote_table_info enhancement
>
> + /* Get relation qual */
> + if (walrcv_server_version(LogRepWorkerWalRcvConn) >= 150000)
> + {
> + resetStringInfo(&cmd);
> + appendStringInfo(&cmd,
> + "SELECT pg_get_expr(prqual, prrelid) "
> + " FROM pg_publication p "
> + " INNER JOIN pg_publication_rel pr "
> + " ON (p.oid = pr.prpubid) "
> + " WHERE pr.prrelid = %u "
> + " AND p.pubname IN (", lrel->remoteid);
>
> =>
>
> I think a small improvement is possible in this SQL.
>
> If we change that to "SELECT DISTINCT pg_get_expr(prqual, prrelid)"...
> then it avoids the copy SQL from having multiple WHERE clauses which
> are all identical. This could happen when subscribed to multiple
> publications which had the same filter for the same table.
Good catch!

> 8. src/backend/replication/pgoutput/pgoutput.c - qual member is redundant
>
> @@ -99,6 +108,9 @@ typedef struct RelationSyncEntry
>
> bool replicate_valid;
> PublicationActions pubactions;
> + List *qual; /* row filter */
> + List *exprstate; /* ExprState for row filter */
> + TupleTableSlot *scantuple; /* tuple table slot for row filter */
>
> =>
>
> Now that the exprstate is introduced I think that the other member
> "qual" is redundant, so it can be removed.
I was thinking about it for the next patch. Removed.

> 9. src/backend/replication/pgoutput/pgoutput.c - comment typo?
8<

> typo: it/that ?
>
> I think it ought to say "This is the same code as ExecPrepareExpr()
> but that is not used because"...
Fixed.

> 10. src/backend/replication/pgoutput/pgoutput.c - redundant debug logging?
>
> + /* Evaluates row filter */
> + result = pgoutput_row_filter_exec_expr(exprstate, ecxt);
> +
> + elog(DEBUG3, "row filter %smatched", result ? "" : "not ");
>
> The above debug logging is really only a repeat (with different
> wording) of the same information already being logged inside the
> pgoutput_row_filter_exec_expr function isn't it? Consider removing the
> redundant logging.
Agreed. Removed.

--
Euler Taveira
EDB https://www.enterprisedb.com/

Attachment Content-Type Size
v19-0001-Row-filter-for-logical-replication.patch text/x-patch 73.0 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Euler Taveira 2021-07-13 20:39:27 Re: row filtering for logical replication
Previous Message Tom Lane 2021-07-13 20:21:40 Re: Reducing memory consumption for pending inval messages