Re: row filtering for logical replication

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>
Cc: Ajin Cherian <itsajin(at)gmail(dot)com>, Euler Taveira <euler(at)eulerto(dot)com>, Peter Smith <smithpb2250(at)gmail(dot)com>, Rahila Syed <rahilasyed90(at)gmail(dot)com>, Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com>, Önder Kalacı <onderkalaci(at)gmail(dot)com>, japin <japinli(at)hotmail(dot)com>, Michael Paquier <michael(at)paquier(dot)xyz>, David Steele <david(at)pgmasters(dot)net>, Craig Ringer <craig(at)2ndquadrant(dot)com>, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, Amit Langote <amitlangote09(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: row filtering for logical replication
Date: 2021-09-25 05:22:47
Message-ID: CAA4eK1LExOLLAM9eNLedfpLsmWR_0piRtAvSG7bUJjbL5UeHog@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Sat, Sep 25, 2021 at 3:07 AM Tomas Vondra
<tomas(dot)vondra(at)enterprisedb(dot)com> wrote:
>
> On 9/24/21 8:09 AM, Amit Kapila wrote:
> > On Thu, Sep 23, 2021 at 6:03 PM Tomas Vondra
> > <tomas(dot)vondra(at)enterprisedb(dot)com> wrote:
> >>
> >> 13) turning update into insert
> >>
> >> I agree with Ajin Cherian [4] that looking at just old or new row for
> >> updates is not the right solution, because each option will "break" the
> >> replica in some case. So I think the goal "keeping the replica in sync"
> >> is the right perspective, and converting the update to insert/delete if
> >> needed seems appropriate.
> >>
> >> This seems a somewhat similar to what pglogical does, because that may
> >> also convert updates (although only to inserts, IIRC) when handling
> >> replication conflicts. The difference is pglogical does all this on the
> >> subscriber, while this makes the decision on the publisher.
> >>
> >> I wonder if this might have some negative consequences, or whether
> >> "moving" this to downstream would be useful for other purposes in the
> >> fuure (e.g. it might be reused for handling other conflicts).
> >>
> >
> > Apart from additional traffic, I am not sure how will we handle all
> > the conditions on subscribers, say if the new row doesn't match, how
> > will subscribers know about this unless we pass row_filter or some
> > additional information along with tuple. Previously, I have done some
> > research and shared in one of the emails above that IBM's InfoSphere
> > Data Replication [1] performs filtering in this way which also
> > suggests that we won't be off here.
> >
>
> I'm certainly not suggesting what we're doing is wrong. Given the design
> of built-in logical replication it makes sense doing it this way, I was
> just thinking aloud about what we might want to do in the future (e.g.
> pglogical uses this to deal with conflicts between multiple sources, and
> so on).
>

Fair enough.

> >>
> >>
> >> 15) pgoutput_row_filter initializing filter
> >>
> >> I'm not sure I understand why the filter initialization gets moved from
> >> get_rel_sync_entry. Presumably, most of what the replication does is
> >> replicating rows, so I see little point in not initializing this along
> >> with the rest of the rel_sync_entry.
> >>
> >
> > Sorry, IIRC, this has been suggested by me and I thought it was best
> > to do any expensive computation the first time it is required. I have
> > shared few cases like in [2] where it would lead to additional cost
> > without any gain. Unless I am missing something, I don't see any
> > downside of doing it in a delayed fashion.
> >
>
> Not sure, but the arguments presented there seem a bit wonky ...
>
> Yes, the work would be wasted if we discard the cached data without
> using it (it might happen for truncate, I'm not sure). But how likely is
> it that such operations happen *in isolation*? I'd bet the workload is
> almost never just a stream of truncates - there are always some
> operations in between that would actually use this.
>

It could also happen with a mix of truncate and other operations as we
decide whether to publish an operation or not after
get_rel_sync_entry.

> Similarly for the errors - IIRC hitting an error means the replication
> restarts, which is orders of magnitude more expensive than anything we
> can save by this delayed evaluation.
>
> I'd keep it simple, for the sake of simplicity of the whole patch.
>

The current version proposed by Peter is not reviewed yet and by
looking at it I have some questions too which I'll clarify in a
separate email. I am not sure if you are against delaying the
expression initialization because of the current code or concept as a
general because if it is later then we have other instances as well
when we don't do all the work in get_rel_sync_entry like building
tuple conversion map which is cached as well.

--
With Regards,
Amit Kapila.

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Amit Kapila 2021-09-25 05:29:46 Re: row filtering for logical replication
Previous Message Noah Misch 2021-09-25 05:21:51 Re: Fixing WAL instability in various TAP tests