Re: row filtering for logical replication

From: Dilip Kumar <dilipbalaut(at)gmail(dot)com>
To: Ajin Cherian <itsajin(at)gmail(dot)com>
Cc: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Euler Taveira <euler(at)eulerto(dot)com>, Peter Smith <smithpb2250(at)gmail(dot)com>, Rahila Syed <rahilasyed90(at)gmail(dot)com>, Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com>, Önder Kalacı <onderkalaci(at)gmail(dot)com>, japin <japinli(at)hotmail(dot)com>, Michael Paquier <michael(at)paquier(dot)xyz>, David Steele <david(at)pgmasters(dot)net>, Craig Ringer <craig(at)2ndquadrant(dot)com>, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, Amit Langote <amitlangote09(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: row filtering for logical replication
Date: 2021-09-21 04:23:44
Message-ID: CAFiTN-uGr4jG2LTOz_nU2tKoPvgQSHoScMTsFaKVxbFeNAkvYA@mail.gmail.com
Lists: pgsql-hackers

On Tue, Sep 21, 2021 at 8:58 AM Ajin Cherian <itsajin(at)gmail(dot)com> wrote:
> > I understand why this is done, but I have 2 concerns here 1) We are
> > having extra deform and copying the field from new to old in case it
> > is unchanged replica identity. 2) The same unchanged attribute values
> > get qualified in the old tuple as well as in the new tuple. What
> > exactly needs to be done is that the only updated field should be
> > validated as part of the old as well as the new tuple, the unchanged
> > field does not make sense to have redundant validation. For that we
> > will have to change the filter for the old tuple to just validate the
> > attributes which are actually modified and remaining unchanged and new
> > values will anyway get validated in the new tuple.
> >
> But what if the filter expression depends on multiple columns, say (a+b) > 100
> where a is unchanged while b is changed. Then we will still need both
> columns for applying

In such a case, we do need both columns to evaluate that expression.

> the filter even though one is unchanged. Also, I am not aware of any
> mechanism by which
> we can apply a filter expression on individual attributes. The current
> mechanism does it
> on a tuple. Do let me know if you have any ideas there?

What I suggested is to modify the filter for the old tuple. For example,
if the filter is (a > 10 and b < 20 and c+d = 20) and only a and c are
modified, then we can transform the filter for the old tuple to
(a > 10 and c+d = 20) and evaluate just that.
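
A rough sketch of that transformation, in Python rather than backend C and
only to illustrate the idea. It assumes the filter is a top-level AND of
clauses and that we know which columns each clause references; Clause,
prune_for_old_tuple and passes are hypothetical names, not anything in the
patch. A clause such as (a+b) > 100 is kept whole whenever either a or b
is modified.

```python
from dataclasses import dataclass
from typing import Callable, Dict, FrozenSet, List

Row = Dict[str, int]


@dataclass(frozen=True)
class Clause:
    """One AND-ed clause of a row filter (hypothetical representation)."""
    text: str                    # human-readable form of the clause
    columns: FrozenSet[str]      # columns the clause references
    test: Callable[[Row], bool]  # evaluates the clause against a row


def prune_for_old_tuple(clauses: List[Clause],
                        modified: FrozenSet[str]) -> List[Clause]:
    """Keep only the clauses that reference at least one modified column."""
    return [c for c in clauses if c.columns & modified]


def passes(clauses: List[Clause], row: Row) -> bool:
    """A row passes when every remaining clause holds."""
    return all(c.test(row) for c in clauses)


# The filter from the example: (a > 10 and b < 20 and c+d = 20)
row_filter = [
    Clause("a > 10", frozenset({"a"}), lambda r: r["a"] > 10),
    Clause("b < 20", frozenset({"b"}), lambda r: r["b"] < 20),
    Clause("c + d = 20", frozenset({"c", "d"}),
           lambda r: r["c"] + r["d"] == 20),
]

# Only a and c were modified by the UPDATE.
old_filter = prune_for_old_tuple(row_filter, frozenset({"a", "c"}))
print([c.text for c in old_filter])  # ['a > 10', 'c + d = 20']
```

The clause on b is dropped for the old tuple because b is unchanged and
will be checked once against the new tuple anyway.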

>
> Even if it were done, there would still be the overhead of deforming the tuple.

Suppose the filter is just (a > 10 and b < 20) and only a is updated.
If we are able to reduce the filter for the old tuple to just (a > 10),
do we still need to deform? Even if we do, we can save a lot by
avoiding duplicate expression evaluation.

> I will run some performance tests like Amit suggested and see what the
> overhead is and
> try to minimise it.

That is good to know. I think you should also try some worst-case
scenarios, e.g. a REPLICA IDENTITY with 10 text columns and 1 int
column, where only the int column gets updated, none of the text
columns are, and the filter references all the columns.

Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com
