Re: row filtering for logical replication

From: Ajin Cherian <itsajin(at)gmail(dot)com>
To: Greg Nancarrow <gregn4422(at)gmail(dot)com>
Cc: Dilip Kumar <dilipbalaut(at)gmail(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Euler Taveira <euler(at)eulerto(dot)com>, Peter Smith <smithpb2250(at)gmail(dot)com>, Rahila Syed <rahilasyed90(at)gmail(dot)com>, Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com>, Önder Kalacı <onderkalaci(at)gmail(dot)com>, japin <japinli(at)hotmail(dot)com>, Michael Paquier <michael(at)paquier(dot)xyz>, David Steele <david(at)pgmasters(dot)net>, Craig Ringer <craig(at)2ndquadrant(dot)com>, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, Amit Langote <amitlangote09(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: row filtering for logical replication
Date: 2021-10-20 02:58:56
Message-ID: CAFPTHDbJUNWtjrixqZGF03orpuWqVnFe3DKxbjVd0BB-xnVpBg@mail.gmail.com
Lists: pgsql-hackers

On Fri, Oct 15, 2021 at 3:30 PM Greg Nancarrow <gregn4422(at)gmail(dot)com> wrote:
>
> On Wed, Oct 13, 2021 at 10:00 PM Ajin Cherian <itsajin(at)gmail(dot)com> wrote:
> >
> > I have made the change to use the virtual slot for expression
> > evaluation and avoided tuple deformation.
> >
>
> I started looking at the v32-0006 patch and have some initial comments.
> Shouldn't old_slot, new_slot and tmp_new_slot be cached in the
> RelationSyncEntry, similar to scantuple?
> Currently, these slots are newly allocated on every call to
> pgoutput_row_filter_update() and, seemingly, never deallocated.
> We previously found that allocating slots for each row filtered
> (over thousands of rows) had a huge performance overhead.
> As an example, scantuple was originally allocated anew for each row
> filtered, and filtering 1,000,000 rows in a test case took 40
> seconds. Caching the allocation in RelationSyncEntry reduced that
> to about 5 seconds.

Thanks for the comment. I have modified patch 6 to cache old_tuple,
new_tuple and tmp_new_tuple.
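
For readers following along, the caching idea can be sketched roughly as below. This is not the patch code: DemoSlot, DemoSyncEntry and every demo_* function are hypothetical stand-ins for TupleTableSlot, RelationSyncEntry and the pgoutput routines, and malloc/calloc stand in for the backend's memory-context allocation. It only illustrates allocating the slots once per cache entry and reusing them for every row.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for TupleTableSlot; not the PostgreSQL definition. */
typedef struct DemoSlot
{
    int   natts;
    long *values;
    char *isnull;
} DemoSlot;

/* Hypothetical stand-in for the per-relation cache entry. */
typedef struct DemoSyncEntry
{
    int       natts;
    DemoSlot *old_slot;      /* allocated on first use, then reused */
    DemoSlot *new_slot;
    DemoSlot *tmp_new_slot;
} DemoSyncEntry;

int demo_slot_allocs = 0;    /* counts slot allocations, for illustration */

DemoSlot *
demo_make_slot(int natts)
{
    DemoSlot *slot;

    assert(natts > 0);
    slot = malloc(sizeof(DemoSlot));
    if (slot == NULL)
        abort();
    slot->natts = natts;
    slot->values = calloc((size_t) natts, sizeof(long));
    slot->isnull = calloc((size_t) natts, sizeof(char));
    if (slot->values == NULL || slot->isnull == NULL)
        abort();
    demo_slot_allocs++;
    return slot;
}

void
demo_free_slot(DemoSlot *slot)
{
    if (slot == NULL)
        return;
    free(slot->values);
    free(slot->isnull);
    free(slot);
}

/* Called once per filtered row: allocates only while the cache is empty. */
void
demo_row_filter_update(DemoSyncEntry *entry)
{
    if (entry->old_slot == NULL)
        entry->old_slot = demo_make_slot(entry->natts);
    if (entry->new_slot == NULL)
        entry->new_slot = demo_make_slot(entry->natts);
    if (entry->tmp_new_slot == NULL)
        entry->tmp_new_slot = demo_make_slot(entry->natts);

    /* ... fill the slots from the old/new tuples and evaluate the filter ... */
}

/* Called when the cache entry is invalidated or dropped. */
void
demo_entry_release(DemoSyncEntry *entry)
{
    demo_free_slot(entry->old_slot);
    demo_free_slot(entry->new_slot);
    demo_free_slot(entry->tmp_new_slot);
    entry->old_slot = entry->new_slot = entry->tmp_new_slot = NULL;
}
```

With an entry initialized to { 3, NULL, NULL, NULL }, calling demo_row_filter_update() any number of times leaves demo_slot_allocs at 3, and demo_entry_release() addresses the "never deallocated" part of the comment.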

On Tue, Oct 12, 2021 at 1:37 AM Dilip Kumar <dilipbalaut(at)gmail(dot)com> wrote:
>
> + if ((att->attlen == -1 &&
> +      VARATT_IS_EXTERNAL_ONDISK(tmp_new_slot->tts_values[i])) &&
> +     (!old_slot->tts_isnull[i] &&
> +      !(VARATT_IS_EXTERNAL_ONDISK(old_slot->tts_values[i]))))
> + {
> +     tmp_new_slot->tts_values[i] = old_slot->tts_values[i];
> +     newtup_changed = true;
> + }
>
> If the attribute is stored EXTERNAL_ONDISK in the new tuple and it is
> not null in the old tuple, then it must be logged completely in the old
> tuple. So instead of checking
> !(VARATT_IS_EXTERNAL_ONDISK(old_slot->tts_values[i])), it should be
> asserted.

Sorry, I missed this in my last update.
For this to be true, shouldn't the fix in [1] be committed? I will
change this once that change is committed.

[1] - https://www.postgresql.org/message-id/OS0PR01MB611342D0A92D4F4BF26C0F47FB229@OS0PR01MB6113.jpnprd01.prod.outlook.com
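
To make the suggestion concrete, here is a rough sketch of what the asserted form could look like. It is not the patch code: DemoDatum and demo_fill_unchanged_toast() are hypothetical, and the external_ondisk flag merely stands in for VARATT_IS_EXTERNAL_ONDISK(). The assertion is only valid under the assumption discussed above, namely that the old tuple always carries the full value (which is what the fix in [1] is about).

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for one column value of a slot. */
typedef struct DemoDatum
{
    bool isnull;
    bool external_ondisk;    /* stand-in for VARATT_IS_EXTERNAL_ONDISK() */
    long value;
} DemoDatum;

/*
 * For each varlena column (attlen[i] == -1) whose new value is still an
 * on-disk TOAST pointer, take the column from the old tuple when it is
 * not null there. Returns true if any column of tmp_new was replaced.
 */
bool
demo_fill_unchanged_toast(const int *attlen, int natts,
                          const DemoDatum *old_vals, DemoDatum *tmp_new)
{
    bool changed = false;

    for (int i = 0; i < natts; i++)
    {
        if (attlen[i] == -1 &&
            tmp_new[i].external_ondisk &&
            !old_vals[i].isnull)
        {
            /* Assumed invariant: the old tuple was logged in full. */
            assert(!old_vals[i].external_ondisk);

            tmp_new[i] = old_vals[i];
            changed = true;
        }
    }
    return changed;
}
```

The difference from the quoted hunk is that a violation of the invariant would now fail loudly in assert-enabled builds instead of being silently skipped.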

regards,
Ajin Cherian
Fujitsu Australia

Attachment                                                       Content-Type              Size
v33-0001-Row-filter-for-logical-replication.patch                application/octet-stream  70.5 KB
v33-0005-PS-POC-Row-filter-validation-walker.patch               application/octet-stream  11.9 KB
v33-0002-PS-Add-tab-auto-complete-support-for-the-Row-Fil.patch  application/octet-stream  2.2 KB
v33-0003-PS-ExprState-cache-modifications.patch                  application/octet-stream  11.4 KB
v33-0004-PS-Row-filter-validation-of-replica-identity.patch      application/octet-stream  20.3 KB
v33-0006-Support-updates-based-on-old-and-new-tuple-in-ro.patch  application/octet-stream  21.4 KB
