Re: row filtering for logical replication

From: Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>
Cc: Euler Taveira <euler(at)eulerto(dot)com>, Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, Peter Smith <smithpb2250(at)gmail(dot)com>, Rahila Syed <rahilasyed90(at)gmail(dot)com>, Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com>, Önder Kalacı <onderkalaci(at)gmail(dot)com>, japin <japinli(at)hotmail(dot)com>, Michael Paquier <michael(at)paquier(dot)xyz>, David Steele <david(at)pgmasters(dot)net>, Craig Ringer <craig(at)2ndquadrant(dot)com>, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, Amit Langote <amitlangote09(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: row filtering for logical replication
Date: 2021-07-16 08:45:59
Message-ID: cf6101dd-01d5-0ae2-0b32-ee2fc31b0ea5@enterprisedb.com
Lists: pgsql-hackers

On 7/16/21 5:26 AM, Amit Kapila wrote:
> On Wed, Jul 14, 2021 at 4:30 PM Dilip Kumar <dilipbalaut(at)gmail(dot)com> wrote:
>>
>> On Wed, Jul 14, 2021 at 3:58 PM Tomas Vondra
>> <tomas(dot)vondra(at)enterprisedb(dot)com> wrote:
>>>
>>> Is there some reasonable rule which of the old/new tuples (or both) to
>>> use for the WHERE condition? Or maybe it'd be handy to allow referencing
>>> OLD/NEW as in triggers?
>>
>> I think for insert we only replicate rows that match the filter
>> condition, so if we update a row we should maintain the same rule,
>> right? That means we should apply the filter at least to the NEW row,
>> IMHO. That said, if a row was inserted that satisfied the filter and
>> was replicated, and we then update it with a new value that does not
>> satisfy the filter, the update will not be replicated. I think that
>> makes sense: if an insert does not send a row that fails the filter
>> to the replica, why should an update do so, right?
>>
>
> There is another theory in this regard which is what if the old row
> (created by the previous insert) is not sent to the subscriber as that
> didn't match the filter but after the update, we decide to send it
> because the updated row (new row) matches the filter condition. In
> this case, I think it will generate an update conflict on the
> subscriber as the old row won't be present. As of now, we just skip
> the update but in the future, we might have some conflict handling
> there.

Right.

> If this is true then even if the new row matches the filter,
> there is no guarantee that it will be applied on the subscriber-side
> unless the old row also matches the filter. Sure, there could be a
> case where the user might have changed the filter between insert and
> update but maybe we can have a separate way to deal with such cases if
> required like providing some provision where the user can specify
> whether it would like to match old/new row in updates?
>

I think the best we can do for now is to document this. AFAICS it can't
be solved without conflict resolution that would turn the UPDATE into an
INSERT. And that would require REPLICA IDENTITY FULL, otherwise the
UPDATE would not have data for all the columns.
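
To make the scenario concrete, here is a rough toy model of it in Python.
It is only a sketch and not how the patch or the apply worker is written:
the names (row_filter, publish_insert, publish_update, COLUMNS) and the
filter "value > 10" are made up for illustration, the subscriber is just
a dict keyed by replica identity, and the filter is checked against the
NEW row only.

```python
# Toy model only: none of these names are PostgreSQL APIs.
# The subscriber table is a dict mapping a key to the non-key columns.

COLUMNS = ("value", "note")  # hypothetical non-key columns of the table


def row_filter(row):
    """Stand-in for a publication row filter such as WHERE (value > 10)."""
    return row["value"] > 10


def publish_insert(subscriber, key, row):
    """Replicate an INSERT only when the row matches the filter."""
    if row_filter(row):
        subscriber[key] = dict(row)
        return True
    return False


def publish_update(subscriber, key, new_row, convert_to_insert=False):
    """Replicate an UPDATE, checking the filter against the NEW row only.

    Returns one of:
      "filtered" - new row fails the filter, nothing is sent
      "applied"  - old row exists on the subscriber and is updated
      "inserted" - old row is missing and the UPDATE is turned into an INSERT
      "skipped"  - old row is missing and the change is dropped
    """
    if not row_filter(new_row):
        return "filtered"
    if key in subscriber:
        subscriber[key].update(new_row)
        return "applied"
    # Turning the UPDATE into an INSERT is only possible when the change
    # carries every column (assumed here to model REPLICA IDENTITY FULL).
    if convert_to_insert and all(col in new_row for col in COLUMNS):
        subscriber[key] = dict(new_row)
        return "inserted"
    return "skipped"


subscriber = {}
# The initial row fails the filter, so the subscriber never sees it.
publish_insert(subscriber, 1, {"value": 5, "note": "a"})
# The new row matches, but there is nothing to update on the subscriber.
outcome = publish_update(subscriber, 1, {"value": 20, "note": "a"})
```

In this model the last call ends up in the "skipped" branch, which
corresponds to the update being skipped on the subscriber today. Passing
convert_to_insert=True with all columns present takes the "inserted"
branch instead, which is the kind of conflict resolution I mean above.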

regards

--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
