Re: row filtering for logical replication

From: Peter Smith <smithpb2250(at)gmail(dot)com>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: Euler Taveira <euler(at)eulerto(dot)com>, Greg Nancarrow <gregn4422(at)gmail(dot)com>, "houzj(dot)fnst(at)fujitsu(dot)com" <houzj(dot)fnst(at)fujitsu(dot)com>, vignesh C <vignesh21(at)gmail(dot)com>, "tanghy(dot)fnst(at)fujitsu(dot)com" <tanghy(dot)fnst(at)fujitsu(dot)com>, Ajin Cherian <itsajin(at)gmail(dot)com>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>, Rahila Syed <rahilasyed90(at)gmail(dot)com>, Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com>, Önder Kalacı <onderkalaci(at)gmail(dot)com>, japin <japinli(at)hotmail(dot)com>, Michael Paquier <michael(at)paquier(dot)xyz>, David Steele <david(at)pgmasters(dot)net>, Craig Ringer <craig(at)2ndquadrant(dot)com>, Amit Langote <amitlangote09(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: row filtering for logical replication
Date: 2021-12-02 07:18:14
Message-ID: CAHut+PtJnnM8MYQDf7xCyFAp13U_0Ya2dv-UQeFD=ghixFLZiw@mail.gmail.com
Lists: pgsql-hackers

On Tue, Nov 30, 2021 at 3:56 PM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
>
> On Mon, Nov 29, 2021 at 8:40 PM Euler Taveira <euler(at)eulerto(dot)com> wrote:
> >
> > On Mon, Nov 29, 2021, at 7:11 AM, Amit Kapila wrote:
> >
> > I don't think it is a good idea to combine the row-filter from the
> > publication that publishes just 'insert' with the row-filter that
> > publishes 'updates'. We shouldn't apply the 'insert' filter for
> > 'update' and similarly for publication operations. We can combine the
> > filters when the published operations are the same. So, this means
> > that we might need to cache multiple row-filters but I think that is
> > better than having another restriction that publish operation 'insert'
> > should also honor RI columns restriction.
> >
> > That's exactly what I meant to say but apparently I didn't explain in detail.
> > If a subscriber has multiple publications and a table is part of these
> > publications with different row filters, it should check the publication action
> > *before* including it in the row filter list. It means that an UPDATE operation
> > cannot apply a row filter that is part of a publication that has only INSERT as
> > an action. Having said that, we cannot always combine multiple row filter
> > expressions into one. Instead, it should cache individual row filter expressions
> > and apply the OR during the row filter execution (as I did in the initial
> > patches before this caching stuff). The other idea is to have multiple caches,
> > one for each action. The main disadvantage of this approach is that it creates
> > 4x the entries.
> >
> > I'm experimenting with the first approach, which stores multiple row filters and
> > their publication actions, right now.
> >
>
> We can try that way but I think we should still be able to combine in
> many cases, like where all the operations are specified for
> publications having the table, or where the pubactions are the same. So, we
> should not give up on those cases. We can do this new logic only when
> we find that pubactions are different, and probably store them as
> independent expressions, along with their corresponding pubactions, at the
> current location in the v42* patch (in pgoutput_row_filter). It is
> okay to combine them at a later stage during execution when we can't
> do it at the time of forming cache entry.
>

PSA a new v44* patch set.

It includes a new patch 0006 which implements the idea above.

The ExprState cache logic is basically the same as before (including
all the OR combining), but there are now 4 ExprState caches, keyed and
separated by the 4 different pubactions.
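
To make the shape of that idea concrete, here is a rough standalone sketch.
It is not code from the 0006 patch: FilterCache, PubAction, RowFilter,
cache_add, row_passes and the two sample filters are all hypothetical
names, and a plain function pointer stands in for an ExprState. It assumes
a NULL filter means "publication has no row filter", and that an action
with no cached entries is not published at all.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; these are NOT the PostgreSQL definitions. */
typedef enum
{
    ACT_INSERT, ACT_UPDATE, ACT_DELETE, ACT_TRUNCATE, NUM_ACTIONS
} PubAction;

typedef bool (*RowFilter) (int row_value);  /* stand-in for an ExprState */

#define MAX_FILTERS 8

/* One list of filters per pubaction, so the actions never mix. */
typedef struct
{
    RowFilter   filters[NUM_ACTIONS][MAX_FILTERS];
    size_t      nfilters[NUM_ACTIONS];
} FilterCache;

/* Cache one publication's filter under every action it publishes. */
static void
cache_add(FilterCache *c, const bool publishes[NUM_ACTIONS], RowFilter f)
{
    for (int a = 0; a < NUM_ACTIONS; a++)
    {
        if (!publishes[a])
            continue;
        assert(c->nfilters[a] < MAX_FILTERS);
        c->filters[a][c->nfilters[a]++] = f;
    }
}

/* OR together only the filters that were cached for this action. */
static bool
row_passes(const FilterCache *c, PubAction action, int row_value)
{
    for (size_t i = 0; i < c->nfilters[action]; i++)
    {
        RowFilter   f = c->filters[action][i];

        /* Assumption: NULL means no row filter, so every row passes. */
        if (f == NULL || f(row_value))
            return true;
    }
    return false;               /* nothing publishes this action */
}

/* Sample filters, purely for illustration. */
static bool value_over_100(int v) { return v > 100; }
static bool value_is_even(int v) { return v % 2 == 0; }
```

With value_over_100 cached from an insert-only publication and
value_is_even cached from an update-only publication, an UPDATE of a row
with value 150 is checked against value_is_even only; the insert-only
filter is never consulted for it.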

------
Kind Regards,
Peter Smith.
Fujitsu Australia

Attachment Content-Type Size
v44-0004-Tab-auto-complete-and-pgdump-support-for-Row-Fil.patch application/octet-stream 5.6 KB
v44-0003-Support-updates-based-on-old-and-new-tuple-in-ro.patch application/octet-stream 19.4 KB
v44-0002-PS-Row-filter-validation-walker.patch application/octet-stream 35.5 KB
v44-0005-cache-the-result-of-row-filter-column-validation.patch application/octet-stream 25.5 KB
v44-0001-Row-filter-for-logical-replication.patch application/octet-stream 85.4 KB
v44-0006-Cache-ExprState-per-pubaction.patch application/octet-stream 14.2 KB
