Re: row filtering for logical replication

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Ajin Cherian <itsajin(at)gmail(dot)com>
Cc: Euler Taveira <euler(at)eulerto(dot)com>, Peter Smith <smithpb2250(at)gmail(dot)com>, Rahila Syed <rahilasyed90(at)gmail(dot)com>, Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com>, Önder Kalacı <onderkalaci(at)gmail(dot)com>, japin <japinli(at)hotmail(dot)com>, Michael Paquier <michael(at)paquier(dot)xyz>, David Steele <david(at)pgmasters(dot)net>, Craig Ringer <craig(at)2ndquadrant(dot)com>, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, Amit Langote <amitlangote09(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: row filtering for logical replication
Date: 2021-09-20 12:06:57
Message-ID: CAA4eK1KNPWVuV-bwH8J8LtZWzh+PpQ+Rz0Lab2mpL=BA97hi3A@mail.gmail.com
Lists: pgsql-hackers

On Mon, Sep 20, 2021 at 3:17 PM Ajin Cherian <itsajin(at)gmail(dot)com> wrote:
>
> On Wed, Sep 8, 2021 at 7:59 PM Ajin Cherian <itsajin(at)gmail(dot)com> wrote:
> >
> > On Wed, Sep 1, 2021 at 9:23 PM Euler Taveira <euler(at)eulerto(dot)com> wrote:
> > >
>
> > Somehow this approach of using either new_tuple or old_tuple doesn't
> > seem very fruitful if the user requires that the replica be
> > up-to-date with respect to the filter condition. For that, I think
> > you will need to convert UPDATEs to either INSERTs or DELETEs when
> > only one of new_tuple and old_tuple matches the filter condition.
> >
> > UPDATE
> > old-row (match)    new-row (no match) -> DELETE
> > old-row (no match) new-row (match)    -> INSERT
> > old-row (match)    new-row (match)    -> UPDATE
> > old-row (no match) new-row (no match) -> (drop change)
> >
>
> Adding a patch that strives to implement the logic that I described
> above. For updates, the row filter is applied to both old_tuple and
> new_tuple. This patch assumes that the row filter only uses columns
> that are part of the REPLICA IDENTITY (the current patch set enforces
> this only for row filters that are delete-only).
> The old_tuple only has columns that are part of the REPLICA IDENTITY
> and have been changed, which is a problem when applying the row
> filter. Since unchanged REPLICA IDENTITY columns are not present in
> the old_tuple, this patch creates a temporary old_tuple by taking
> those column values from the new_tuple, and then applies the filter
> to this hand-created temporary old_tuple. The way the old_tuple is
> created can be better optimised in future versions.
>
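
For reference, the UPDATE mapping quoted above can be modelled as a
small decision function. This is only an illustrative Python sketch,
not code from the patch; the names Action and transform_update are
hypothetical, and the two booleans stand for the result of evaluating
the row filter on the old and new tuples.

```python
from enum import Enum


class Action(Enum):
    """What the publisher sends for an UPDATE (hypothetical names)."""
    INSERT = "INSERT"
    UPDATE = "UPDATE"
    DELETE = "DELETE"
    DROP = "DROP"  # change is not replicated at all


def transform_update(old_matches: bool, new_matches: bool) -> Action:
    """Map the filter results for the old and new row to an action."""
    if old_matches and new_matches:
        return Action.UPDATE   # row stays inside the filtered set
    if old_matches:
        return Action.DELETE   # row leaves the filtered set
    if new_matches:
        return Action.INSERT   # row enters the filtered set
    return Action.DROP         # row was never visible to the subscriber
```

The point of the conversion is that the subscriber's copy always holds
exactly the rows that currently satisfy the filter.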

Yeah, this is the kind of idea that can work. One thing you might
want to check is the overhead of the additional deform/form cycle; you
could use Peter's tests above for that. I think you only need to form
the old/new tuples when you have changed something in them, but on a
quick look it seems you are always re-forming both tuples.
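
To illustrate the point about re-forming only when needed, here is a
rough Python model, not the patch's C code. The names MISSING and
fill_old_tuple are hypothetical; tuples are modelled as plain Python
tuples, and MISSING marks a column that is absent from the old tuple.
In the real code the copy would presumably correspond to a
heap_deform_tuple()/heap_form_tuple() cycle.

```python
# Sentinel for "column not present in the old tuple" (hypothetical).
# A dedicated object is used so that None can still stand for SQL NULL.
MISSING = object()


def fill_old_tuple(old_values, new_values):
    """Fill missing old-tuple columns from the new tuple.

    Returns (values, reformed). When nothing is missing, the original
    old_values object is returned untouched and reformed is False, so
    the caller can skip the form step entirely.
    """
    filled = None
    for i, value in enumerate(old_values):
        if value is MISSING:
            if filled is None:
                # Copy lazily, only once we know a change is needed.
                filled = list(old_values)
            filled[i] = new_values[i]
    if filled is None:
        return old_values, False
    return tuple(filled), True
```

With this shape, the cost of building a new tuple is paid only for
updates where at least one column actually had to be filled in.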

--
With Regards,
Amit Kapila.
