Re: row filtering for logical replication

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Dilip Kumar <dilipbalaut(at)gmail(dot)com>
Cc: Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>, Euler Taveira <euler(at)eulerto(dot)com>, Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, Peter Smith <smithpb2250(at)gmail(dot)com>, Rahila Syed <rahilasyed90(at)gmail(dot)com>, Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com>, Önder Kalacı <onderkalaci(at)gmail(dot)com>, japin <japinli(at)hotmail(dot)com>, Michael Paquier <michael(at)paquier(dot)xyz>, David Steele <david(at)pgmasters(dot)net>, Craig Ringer <craig(at)2ndquadrant(dot)com>, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, Amit Langote <amitlangote09(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: row filtering for logical replication
Date: 2021-07-16 03:26:53
Message-ID: CAA4eK1LB7-THsVzB7h-w9gSf7Cym3SZ++6s5CKNL-OfNfQmARQ@mail.gmail.com
Lists: pgsql-hackers

On Wed, Jul 14, 2021 at 4:30 PM Dilip Kumar <dilipbalaut(at)gmail(dot)com> wrote:
>
> On Wed, Jul 14, 2021 at 3:58 PM Tomas Vondra
> <tomas(dot)vondra(at)enterprisedb(dot)com> wrote:
> >
> > Is there some reasonable rule which of the old/new tuples (or both) to
> > use for the WHERE condition? Or maybe it'd be handy to allow referencing
> > OLD/NEW as in triggers?
>
> I think for insert we are only allowing those rows to replicate which
> are matching filter conditions, so if we updating any row then also we
> should maintain that sanity right? That means at least on the NEW rows
> we should apply the filter, IMHO. Said that, now if there is any row
> inserted which were satisfying the filter and replicated, if we update
> it with the new value which is not satisfying the filter then it will
> not be replicated, I think that makes sense because if an insert is
> not sending any row to a replica which is not satisfying the filter
> then why update has to do that, right?
>

There is another angle to consider here: what if the old row (created
by the previous insert) was not sent to the subscriber because it didn't
match the filter, but after the update we decide to send the change
because the updated (new) row matches the filter condition? In this
case, I think it will generate an update conflict on the subscriber, as
the old row won't be present there. As of now, we just skip such an
update, but in the future we might have some conflict handling there.
If this is true, then even if the new row matches the filter, there is
no guarantee that it will be applied on the subscriber side unless the
old row also matches the filter. Sure, there could be a case where the
user has changed the filter between the insert and the update, but
maybe we can have a separate way to deal with such cases if required,
like providing an option where the user can specify whether to match
the old or the new row in updates?
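
To make the scenario above concrete, here is a toy model in Python. It
is only a sketch and not PostgreSQL code: the filter (val > 10), the
Subscriber class, and the publish_* helpers are all hypothetical names
made up for illustration. It assumes the filter is evaluated on the new
row only and that the subscriber skips an update whose old row is
missing.

```python
# Toy model of the scenario; every name here is hypothetical.

def row_filter(row):
    # Stand-in for a publication row filter such as WHERE (val > 10).
    return row["val"] > 10


class Subscriber:
    def __init__(self):
        self.rows = {}     # primary key -> row
        self.skipped = []  # keys of updates that found no old row

    def apply_insert(self, row):
        self.rows[row["id"]] = dict(row)

    def apply_update(self, key, new_row):
        if key not in self.rows:
            # Old row was never replicated, so the update is skipped.
            self.skipped.append(key)
            return
        self.rows[key] = dict(new_row)


def publish_insert(sub, row):
    # Inserts are sent only when the row matches the filter.
    if row_filter(row):
        sub.apply_insert(row)


def publish_update(sub, old_row, new_row):
    # Assumption: the filter is checked against the NEW row only.
    if row_filter(new_row):
        sub.apply_update(old_row["id"], new_row)


sub = Subscriber()
old = {"id": 1, "val": 5}    # does not match the filter
new = {"id": 1, "val": 20}   # matches the filter

publish_insert(sub, old)       # filtered out, never reaches the subscriber
publish_update(sub, old, new)  # sent, but there is no old row to update
```

In this model the subscriber ends up with no rows and one skipped
update, even though the new row satisfies the filter.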

--
With Regards,
Amit Kapila.
