Re: row filtering for logical replication

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Euler Taveira <euler(at)eulerto(dot)com>
Cc: Peter Smith <smithpb2250(at)gmail(dot)com>, Rahila Syed <rahilasyed90(at)gmail(dot)com>, Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com>, Önder Kalacı <onderkalaci(at)gmail(dot)com>, japin <japinli(at)hotmail(dot)com>, Michael Paquier <michael(at)paquier(dot)xyz>, David Steele <david(at)pgmasters(dot)net>, Craig Ringer <craig(at)2ndquadrant(dot)com>, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, Amit Langote <amitlangote09(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: row filtering for logical replication
Date: 2021-09-01 12:36:00
Message-ID: CAA4eK1L1juKb8mofeBerrfQ09N2QhUMXog4FFaL2vG_2Kt8kXA@mail.gmail.com
Lists: pgsql-hackers

On Wed, Sep 1, 2021 at 4:53 PM Euler Taveira <euler(at)eulerto(dot)com> wrote:
>
> On Sun, Aug 29, 2021, at 11:14 PM, Peter Smith wrote:
>
> > Here are the new v26* patches. This is a refactoring of the row-filter
> > caches to remove all the logic from the get_rel_sync_entry function
> > and delay it until if/when needed in the pgoutput_row_filter function.
> > This is now implemented per Amit's suggestion to move all the cache
> > code [1]. It is a replacement for the v25* patches.
> >
> > The make check and TAP subscription tests are all OK. I have repeated
> > the performance tests [2] and those results are good too.
> >
> > v26-0001 <--- v23 (base RF patch)
> > v26-0002 <--- ExprState cache mods (refactored row filter caching)
> > v26-0002 <--- ExprState cache extra debug logging (temp)
>
> Peter, I'm still reviewing this new cache mechanism. I will provide feedback
> as soon as I integrate it as part of this recent modification.
>
> I'm attaching a new version that simply includes Houzj's review [1]. This is
> based on v23.
>
> There has been a discussion about which row should be used by row filter. We
> don't have a unanimous choice, so I think it is prudent to provide a way for
> the user to change it. I suggested in a previous email [2] that a publication
> option should be added. Hence, the row filter can be applied to the old tuple,
> the new tuple, or both. This approach is simpler than using OLD/NEW references
> (less code, and it avoids validation such as a NEW reference for DELETEs and
> an OLD reference for INSERTs). I thought about a reasonable default value, and
> the _new_ tuple seems to be a good one because (i) it is always available and
> (ii) the user doesn't have to figure out that replication is broken due to a
> column that is not part of the replica identity.
>

I think this or any other similar solution for row filters (on
updates) won't work until we solve the problem reported by Hou-San [1].
The main reason is that we don't have data for unchanged toast columns
in WAL. For that, we have discussed some probable solutions in email
[2]; however, that also requires us to solve one of the existing
bugs [3].
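
To illustrate the unchanged-toast problem, here is a minimal sketch in
Python. It is only a toy model and not the pgoutput code: the
UNCHANGED_TOAST marker, the row_filter_decision function, and the example
table columns are all hypothetical names made up for this illustration. It
assumes a filter that references a toasted column which the UPDATE did not
modify, so the decoded new tuple has no value for it.

```python
# Hypothetical marker for a TOAST column that the UPDATE did not modify,
# so its value was not logged and is unavailable when decoding.
UNCHANGED_TOAST = object()


def row_filter_decision(new_tuple, filter_columns, predicate):
    """Evaluate a row filter against the decoded new tuple.

    Returns True or False when every referenced column is available,
    and None when the filter cannot be evaluated at all.
    """
    for col in filter_columns:
        if new_tuple[col] is UNCHANGED_TOAST:
            return None
    return predicate(new_tuple)


def body_is_public(row):
    # Hypothetical row filter, roughly: WHERE body LIKE 'public%'
    return row["body"].startswith("public")


# UPDATE that changed only "title"; "body" is a large, unchanged toasted value.
decoded_update = {"id": 1, "title": "renamed", "body": UNCHANGED_TOAST}
# INSERT, where every column value is present.
decoded_insert = {"id": 2, "title": "new", "body": "public: hello"}

undecidable = row_filter_decision(decoded_update, ["body"], body_is_public)
decidable = row_filter_decision(decoded_insert, ["body"], body_is_public)
```

In this model the INSERT can be filtered, but the UPDATE yields no decision
because the column the filter needs is simply not there, whichever of the
old or new tuple is chosen as the default.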

[1] - https://www.postgresql.org/message-id/OS0PR01MB571618736E7E79309A723BBE94E99%40OS0PR01MB5716.jpnprd01.prod.outlook.com
[2] - https://www.postgresql.org/message-id/CAA4eK1JLQqNZypOpN7h3%3DVt0JJW4Yb_FsLJS%3DT8J9J-WXgFMYg%40mail.gmail.com
[3] - https://www.postgresql.org/message-id/OS0PR01MB611342D0A92D4F4BF26C0F47FB229@OS0PR01MB6113.jpnprd01.prod.outlook.com

--
With Regards,
Amit Kapila.
