Re: row filtering for logical replication

From: "Euler Taveira" <euler(at)eulerto(dot)com>
To: "Peter Smith" <smithpb2250(at)gmail(dot)com>, "Amit Kapila" <amit(dot)kapila16(at)gmail(dot)com>
Cc: "Rahila Syed" <rahilasyed90(at)gmail(dot)com>, "Peter Eisentraut" <peter(dot)eisentraut(at)enterprisedb(dot)com>, Önder Kalacı <onderkalaci(at)gmail(dot)com>, japin <japinli(at)hotmail(dot)com>, "Michael Paquier" <michael(at)paquier(dot)xyz>, "David Steele" <david(at)pgmasters(dot)net>, "Craig Ringer" <craig(at)2ndquadrant(dot)com>, "Tomas Vondra" <tomas(dot)vondra(at)2ndquadrant(dot)com>, "Amit Langote" <amitlangote09(at)gmail(dot)com>, "PostgreSQL Hackers" <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: row filtering for logical replication
Date: 2021-09-01 11:21:34
Message-ID: 0c2464d4-65f4-4d91-aeb2-c5584c1350f5@www.fastmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Sun, Aug 29, 2021, at 11:14 PM, Peter Smith wrote:
> Here are the new v26* patches. This is a refactoring of the row-filter
> caches to remove all the logic from the get_rel_sync_entry function
> and delay it until if/when needed in the pgoutput_row_filter function.
> This is now implemented per Amit's suggestion to move all the cache
> code [1]. It is a replacement for the v25* patches.
>
> The make check and TAP subscription tests are all OK. I have repeated
> the performance tests [2] and those results are good too.
>
> v26-0001 <--- v23 (base RF patch)
> v26-0002 <--- ExprState cache mods (refactored row filter caching)
> v26-0002 <--- ExprState cache extra debug logging (temp)
Peter, I'm still reviewing this new cache mechanism. I will provide a feedback
as soon as I integrate it as part of this recent modification.

I'm attaching a new version that simply including Houzj review [1]. This is
based on v23.

There has been a discussion about which row should be used by row filter. We
don't have a unanimous choice, so I think it is prudent to provide a way for
the user to change it. I suggested in a previous email [2] that a publication
option should be added. Hence, row filter can be applied to old tuple, new
tuple, or both. This approach is simpler than using OLD/NEW references (less
code and avoid validation such as NEW reference for DELETEs and OLD reference
for INSERTs). I think about a reasonable default value and it seems _new_ tuple
is a good one because (i) it is always available and (ii) user doesn't have
to figure out that replication is broken due to a column that is not part
of replica identity. I'm attaching a POC that implements it. I'm still
polishing it. Add tests for multiple row filters and integrate Peter's caching
mechanism [3] are the next steps.

[1] https://www.postgresql.org/message-id/OS0PR01MB571696CA853B3655F7DE752994E29%40OS0PR01MB5716.jpnprd01.prod.outlook.com
[2] https://www.postgresql.org/message-id/5a3f74df-ffa1-4126-a5d8-dbb081d3e439%40www.fastmail.com
[3] https://www.postgresql.org/message-id/CAHut%2BPsgRHymwLhJ9t3By6%2BKNaVDzfjf6Y4Aq%3DJRD-y8t1mEFg%40mail.gmail.com

--
Euler Taveira
EDB https://www.enterprisedb.com/

Attachment Content-Type Size
v27-0001-Row-filter-for-logical-replication.patch text/x-patch 73.1 KB
v27-0002-publication-parameter-row_filter_on_update.patch text/x-patch 24.1 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Amit Kapila 2021-09-01 11:53:13 Re: Separate out FileSet from SharedFileSet (was Re: pgsql: pgstat: Bring up pgstat in BaseInit() to fix uninitialized use o)
Previous Message Daniel Gustafsson 2021-09-01 11:15:27 Re: postgres_fdw: Handle boolean comparison predicates