Re: row filtering for logical replication

From:	Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>
To:	Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>
Cc:	Euler Taveira <euler(at)eulerto(dot)com>, Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, Peter Smith <smithpb2250(at)gmail(dot)com>, Rahila Syed <rahilasyed90(at)gmail(dot)com>, Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com>, Önder Kalacı <onderkalaci(at)gmail(dot)com>, japin <japinli(at)hotmail(dot)com>, Michael Paquier <michael(at)paquier(dot)xyz>, David Steele <david(at)pgmasters(dot)net>, Craig Ringer <craig(at)2ndquadrant(dot)com>, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, Amit Langote <amitlangote09(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject:	Re: row filtering for logical replication
Date:	2021-07-16 08:45:59
Message-ID:	cf6101dd-01d5-0ae2-0b32-ee2fc31b0ea5@enterprisedb.com
Lists:	pgsql-hackers

On 7/16/21 5:26 AM, Amit Kapila wrote:
> On Wed, Jul 14, 2021 at 4:30 PM Dilip Kumar <dilipbalaut(at)gmail(dot)com> wrote:
>>
>> On Wed, Jul 14, 2021 at 3:58 PM Tomas Vondra
>> <tomas(dot)vondra(at)enterprisedb(dot)com> wrote:
>>>
>>> Is there some reasonable rule which of the old/new tuples (or both) to
>>> use for the WHERE condition? Or maybe it'd be handy to allow referencing
>>> OLD/NEW as in triggers?
>>
>> I think for insert we only replicate rows that match the filter
>> condition, so if we update a row we should maintain the same rule,
>> right? That means we should apply the filter at least to the NEW row,
>> IMHO. That said, if a row was inserted that satisfied the filter and
>> was replicated, and we then update it to a value that no longer
>> satisfies the filter, the update will not be replicated. I think that
>> makes sense: if an insert does not send a row that fails the filter
>> to the replica, why should an update do so, right?
>>
>
> There is another theory in this regard which is what if the old row
> (created by the previous insert) is not sent to the subscriber as that
> didn't match the filter but after the update, we decide to send it
> because the updated row (new row) matches the filter condition. In
> this case, I think it will generate an update conflict on the
> subscriber as the old row won't be present. As of now, we just skip
> the update but in the future, we might have some conflict handling
> there.

Right.

> If this is true then even if the new row matches the filter,
> there is no guarantee that it will be applied on the subscriber-side
> unless the old row also matches the filter. Sure, there could be a
> case where the user might have changed the filter between insert and
> update but maybe we can have a separate way to deal with such cases if
> required like providing some provision where the user can specify
> whether it would like to match old/new row in updates?
>

I think the best we can do for now is to document this. AFAICS it can't
be solved without conflict resolution that would turn the UPDATE into an
INSERT. And that would require REPLICA IDENTITY FULL, otherwise the
UPDATE would not have data for all the columns.
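
To make the scenario concrete, here is a toy model of the case discussed
above. It is only a sketch, not PostgreSQL code: the filter (a > 10), the
Subscriber class and the publish_* helpers are all hypothetical names, and
it assumes the filter is evaluated on the NEW row only and that the
subscriber skips an UPDATE whose old row is missing.

```python
# Toy model only; nothing here is a PostgreSQL API.

def row_filter(row):
    # Hypothetical stand-in for a publication row filter such as (a > 10).
    return row["a"] > 10


class Subscriber:
    def __init__(self):
        self.rows = {}  # rows keyed by replica identity (here: "id")

    def apply_insert(self, new):
        self.rows[new["id"]] = dict(new)
        return "inserted"

    def apply_update(self, key, new):
        if key not in self.rows:
            # Old row was never replicated, so the update is skipped
            # (assumed behaviour, as described in the thread).
            return "skipped"
        self.rows[key] = dict(new)
        return "updated"


def publish_insert(sub, new):
    # Send the insert only if the new row matches the filter.
    return sub.apply_insert(new) if row_filter(new) else "filtered"


def publish_update(sub, key, new):
    # Assumption: the filter is checked against the NEW row only.
    return sub.apply_update(key, new) if row_filter(new) else "filtered"


sub = Subscriber()
r1 = publish_insert(sub, {"id": 1, "a": 5})      # old row fails the filter
r2 = publish_update(sub, 1, {"id": 1, "a": 20})  # new row passes the filter
print(r1, r2, sub.rows)
```

In this model the UPDATE passes the filter on the publisher but still
leaves nothing on the subscriber, which is the behaviour that would need
to be documented.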

regards

--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

In response to

Re: row filtering for logical replication at 2021-07-16 03:26:53 from Amit Kapila