Re: row filtering for logical replication

From: Greg Nancarrow <gregn4422(at)gmail(dot)com>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: Dilip Kumar <dilipbalaut(at)gmail(dot)com>, Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>, Euler Taveira <euler(at)eulerto(dot)com>, Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, Peter Smith <smithpb2250(at)gmail(dot)com>, Rahila Syed <rahilasyed90(at)gmail(dot)com>, Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com>, Önder Kalacı <onderkalaci(at)gmail(dot)com>, japin <japinli(at)hotmail(dot)com>, Michael Paquier <michael(at)paquier(dot)xyz>, David Steele <david(at)pgmasters(dot)net>, Craig Ringer <craig(at)2ndquadrant(dot)com>, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, Amit Langote <amitlangote09(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: row filtering for logical replication
Date: 2021-07-16 07:37:05
Message-ID: CAJcOf-eciNBeRYMVVV1sbhjTQ-SXVnz1tdwkR3B24bAuBMpEZA@mail.gmail.com
Lists: pgsql-hackers

On Fri, Jul 16, 2021 at 3:50 PM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
>
> I am not so sure about different filters for old and new rows but it
> makes sense to by default apply the filter to both old and new rows.
> Then also provide a way for user to specify if the filter can be
> specified to just old or new row.
>

I have some doubts and concerns about what is being suggested.

My current view is that the row filter should (initially, or at least
by default) specify the condition on the row data at the publication
boundary, i.e. on what is actually sent to and received by the
subscriber. For UPDATE, that means the filter should operate on the
new value.

This has the clear advantage that the WHERE expression alone tells you
what restrictions apply to the data that is actually published and
that subscribers will actually receive. So it's more predictable.
If we filter on OLD rows, then we would need to know exactly what the
UPDATE changes in order to know what is actually published (for
example, the UPDATE could modify the columns being checked in the
publication WHERE expression).
I'm not saying that's wrong, or a bad idea, but it's more complicated
and potentially confusing. Maybe there could be an option for it.

Also, even if we allowed OLD/NEW to be specified in the WHERE
expression, OLD wouldn't make sense for INSERT and NEW wouldn't make
sense for DELETE, so a single WHERE expression with OLD/NEW references
couldn't validly cover all of the operations INSERT, UPDATE and DELETE.
I think that is essentially what Dilip was referring to with his
suggestion of using different filters for different operations (though
I think that may be going too far for the initial implementation).
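
As a rough illustration of the OLD-versus-NEW difference for UPDATE, here is
a small Python sketch. It is only a model of the semantics discussed above,
not code from the patch: the helper names (publish_on_new, publish_on_old),
the dict-based rows and the example filter "price > 100" are all
hypothetical. The fallback to the only available row for INSERT and DELETE
is likewise an assumption made for the sketch.

```python
from typing import Callable, Optional

Row = dict  # a row is modelled as {column name: value}
RowFilter = Callable[[Row], bool]


def publish_on_new(row_filter: RowFilter,
                   old: Optional[Row], new: Optional[Row]) -> bool:
    """Evaluate the filter on the NEW row when one exists.

    INSERT has only a new row, UPDATE has both, DELETE has only an old
    row. For DELETE this sketch falls back to the old row (assumption).
    """
    row = new if new is not None else old
    return row_filter(row)


def publish_on_old(row_filter: RowFilter,
                   old: Optional[Row], new: Optional[Row]) -> bool:
    """Evaluate the filter on the OLD row when one exists.

    For INSERT this sketch falls back to the new row (assumption).
    """
    row = old if old is not None else new
    return row_filter(row)


def expensive(row: Row) -> bool:
    # Hypothetical publication filter, standing in for WHERE (price > 100).
    return row["price"] > 100


# An UPDATE that modifies the very column the filter checks.
old = {"id": 1, "price": 150}
new = {"id": 1, "price": 50}

# Filtering on NEW: the filter alone says what the subscriber can receive.
print(publish_on_new(expensive, old, new))  # False

# Filtering on OLD: the change is published although the row that is sent
# (price = 50) does not satisfy the filter.
print(publish_on_old(expensive, old, new))  # True
```

With the NEW-row rule, everything that reaches the subscriber satisfies the
filter. With the OLD-row rule, one has to look at what the UPDATE changed to
know what was actually published.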

Regards,
Greg Nancarrow
Fujitsu Australia
