Re: row filtering for logical replication

From: "Euler Taveira" <euler(at)eulerto(dot)com>
To: "Rahila Syed" <rahilasyed90(at)gmail(dot)com>
Cc: Önder Kalacı <onderkalaci(at)gmail(dot)com>, japin <japinli(at)hotmail(dot)com>, "Michael Paquier" <michael(at)paquier(dot)xyz>, "David Steele" <david(at)pgmasters(dot)net>, "Craig Ringer" <craig(at)2ndquadrant(dot)com>, "Tomas Vondra" <tomas(dot)vondra(at)2ndquadrant(dot)com>, "Amit Langote" <amitlangote09(at)gmail(dot)com>, "PostgreSQL Hackers" <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: row filtering for logical replication
Date: 2021-03-22 01:58:14
Message-ID: e59c95d4-2029-437c-bd6a-b513ef2ae794@www.fastmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, Mar 9, 2021, at 12:05 PM, Rahila Syed wrote:
> Please find some comments below:
Thanks for your review.

> 1. If the where clause contains non-replica identity columns, the delete performed on a replicated row
> using DELETE from pub_tab where repl_ident_col = n;
> is not being replicated, as logical replication does not have any info whether the column has
> to be filtered or not.
> Shouldn't a warning be thrown in this case to notify the user that the delete is not replicated.
Isn't documentation enough? If you add a WARNING, it should be printed per row,
hence, a huge DELETE will flood the client with WARNING messages by default. If
you are thinking about LOG messages, it is a different story. However, we
should limit those messages to one per transaction. Even if we add such an aid,
it would impose a performance penalty while checking the DELETE is not
replicating because the row filter contains a column that is not part of the PK
or REPLICA IDENTITY. If I were to add any message, it would be to warn at the
creation time (CREATE PUBLICATION or ALTER PUBLICATION ... [ADD|SET] TABLE).

> 2. Same for update, even if I update a row to match the quals on publisher, it is still not being replicated to
> the subscriber. (if the quals contain non-replica identity columns). I think for UPDATE at least, the new value
> of the non-replicate identity column is available which can be used to filter and replicate the update.
Indeed, the row filter for UPDATE uses the new tuple. Maybe your non-replica
identity column contains NULL that evaluates the expression to false.

> 3. 0001.patch,
> Why is the name of the existing ExclusionWhereClause node being changed, if the exact same definition is being used?
Because this node ExclusionWhereClause is used for exclusion constraint. This
patch renames the node to made it clear it is a generic node that could be used
for other filtering features in the future.

> For 0002.patch,
> 4. +
> + memset(lrel, 0, sizeof(LogicalRepRelation));
>
> Is this needed, apart from the above, patch does not use or update lrel at all in that function.
Good catch. It is a leftover from a previous patch. It will be fixed in the
next patch set.

> 5. PublicationRelationQual and PublicationTable have similar fields, can PublicationTable
> be used in place of PublicationRelationQual instead of defining a new struct?
I don't think it is a good idea to have additional fields in a parse node. The
DDL commands use Relation (PublicationTableQual) and parse code uses RangeVar
(PublicationTable). publicationcmds.c uses Relation everywhere so I decided to
create a new struct to store Relation and qual as a list item. It also minimizes the places
you have to modify.

--
Euler Taveira
EDB https://www.enterprisedb.com/

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Euler Taveira 2021-03-22 02:15:10 Re: row filtering for logical replication
Previous Message Thomas Munro 2021-03-22 01:53:43 Re: Why logical replication lancher exits 1?