Re: row filtering for logical replication

From: Rahila Syed <rahilasyed90(at)gmail(dot)com>
To: Euler Taveira <euler(at)eulerto(dot)com>
Cc: Önder Kalacı <onderkalaci(at)gmail(dot)com>, japin <japinli(at)hotmail(dot)com>, Michael Paquier <michael(at)paquier(dot)xyz>, David Steele <david(at)pgmasters(dot)net>, Craig Ringer <craig(at)2ndquadrant(dot)com>, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, Amit Langote <amitlangote09(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: row filtering for logical replication
Date: 2021-03-18 10:51:55
Message-ID: CAH2L28sS8=jY6Lj2-xVu6exxQ9gK8uGZr9jf64zDw9hacMrkyA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi Euler,

Please find below some review comments,

1.
+
+ <row>
+ <entry><structfield>prqual</structfield></entry>
+ <entry><type>pg_node_tree</type></entry>
+ <entry></entry>
+ <entry>Expression tree (in <function>nodeToString()</function>
+ representation) for the relation's qualifying condition</entry>
+ </row>
I think the docs are being incorrectly updated to add a column to
pg_partitioned_table
instead of pg_publication_rel.

2. +typedef struct PublicationRelationQual
+{
+ Oid relid;
+ Relation relation;
+ Node *whereClause;
+} PublicationRelationQual;

Can this be given a more generic name like PublicationRelationInfo, so that
the same struct
can be used to store additional relation information in future, for ex.
column names, if column filtering is introduced.

3. Also, in the above structure, it seems that we can do with storing just
relid and derive relation information from it
using table_open when needed. Am I missing something?

4. Currently in logical replication, I noticed that an UPDATE is being
applied on the subscriber even if the column values
are unchanged. Can row-filtering feature be used to change it such that,
when all the OLD.columns = NEW.columns, filter out
the row from being sent to the subscriber. I understand this would need
REPLICA IDENTITY FULL to work, but would be an
improvement from the existing state.

On subscriber:

postgres=# select xmin, * from tab_rowfilter_1;
xmin | a | b
------+---+-------------
555 | 1 | unfiltered
(1 row)

On publisher:
postgres=# ALTER TABLE tab_rowfilter_1 REPLICA IDENTITY FULL;
ALTER TABLE
postgres=# update tab_rowfilter_1 SET b = 'unfiltered' where a = 1;
UPDATE 1

On Subscriber: The xmin has changed indicating the update from the
publisher was applied
even though nothing changed.

postgres=# select xmin, * from tab_rowfilter_1;
xmin | a | b
------+---+-------------
556 | 1 | unfiltered
(1 row)

5. Currently, any existing rows that were not replicated, when updated to
match the publication quals
using UPDATE tab SET pub_qual_column = 'not_filtered' where a = 1; won't be
applied, as row
does not exist on the subscriber. It would be good if ALTER SUBSCRIBER
REFRESH PUBLICATION
would help fetch such existing rows from publishers that match the qual
now(either because the row changed
or the qual changed)

Thank you,
Rahila Syed

On Tue, Mar 9, 2021 at 8:35 PM Rahila Syed <rahilasyed90(at)gmail(dot)com> wrote:

> Hi Euler,
>
> Please find some comments below:
>
> 1. If the where clause contains non-replica identity columns, the delete
> performed on a replicated row
> using DELETE from pub_tab where repl_ident_col = n;
> is not being replicated, as logical replication does not have any info
> whether the column has
> to be filtered or not.
> Shouldn't a warning be thrown in this case to notify the user that the
> delete is not replicated.
>
> 2. Same for update, even if I update a row to match the quals on
> publisher, it is still not being replicated to
> the subscriber. (if the quals contain non-replica identity columns). I
> think for UPDATE at least, the new value
> of the non-replicate identity column is available which can be used to
> filter and replicate the update.
>
> 3. 0001.patch,
> Why is the name of the existing ExclusionWhereClause node being changed,
> if the exact same definition is being used?
>
> For 0002.patch,
> 4. +
> + memset(lrel, 0, sizeof(LogicalRepRelation));
>
> Is this needed, apart from the above, patch does not use or update lrel at
> all in that function.
>
> 5. PublicationRelationQual and PublicationTable have similar fields, can
> PublicationTable
> be used in place of PublicationRelationQual instead of defining a new
> struct?
>
> Thank you,
> Rahila Syed
>
>

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Thomas Munro 2021-03-18 11:05:11 Re: fdatasync performance problem with large number of DB files
Previous Message Sergei Kornilov 2021-03-18 10:44:40 Re: hint Consider using pg_file_read()