Re: row filtering for logical replication

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Dilip Kumar <dilipbalaut(at)gmail(dot)com>
Cc: "houzj(dot)fnst(at)fujitsu(dot)com" <houzj(dot)fnst(at)fujitsu(dot)com>, Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>, Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, Greg Nancarrow <gregn4422(at)gmail(dot)com>, Euler Taveira <euler(at)eulerto(dot)com>, Peter Smith <smithpb2250(at)gmail(dot)com>, Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com>, Önder Kalacı <onderkalaci(at)gmail(dot)com>, japin <japinli(at)hotmail(dot)com>, Michael Paquier <michael(at)paquier(dot)xyz>, David Steele <david(at)pgmasters(dot)net>, Craig Ringer <craig(at)2ndquadrant(dot)com>, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, Amit Langote <amitlangote09(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Rahila Syed <rahilasyed90(at)gmail(dot)com>
Subject: Re: row filtering for logical replication
Date: 2021-08-03 10:55:44
Message-ID: CAA4eK1JLQqNZypOpN7h3=Vt0JJW4Yb_FsLJS=T8J9J-WXgFMYg@mail.gmail.com
Lists: pgsql-hackers

On Tue, Jul 27, 2021 at 9:56 AM Dilip Kumar <dilipbalaut(at)gmail(dot)com> wrote:
>
> On Tue, Jul 27, 2021 at 6:21 AM houzj(dot)fnst(at)fujitsu(dot)com
> <houzj(dot)fnst(at)fujitsu(dot)com> wrote:
>
> > 1) UPDATE a nonkey column in publisher.
> > 2) Use debugger to block the walsender process in function
> > pgoutput_row_filter_exec_expr().
> > 3) Open another psql to connect the publisher, and drop the table which updated
> > in 1).
> > 4) Unblock the debugger in 2), and then I can see the following error:
> > ---
> > ERROR: could not read block 0 in file "base/13675/16391"
>
> Yeah, that's a big problem, seems like the expression evaluation
> machinery directly going and detoasting the externally stored data
> using some random snapshot. Ideally, in walsender we can never
> attempt to detoast the data because there is no guarantee that those
> data are preserved. Somehow before going to the expression evaluation
> machinery, I think we will have to deform that tuple and need to do
> something for the externally stored data otherwise it will be very
> difficult to control that inside the expression evaluation.
>

True. I think this would become possible once we fix the issue
reported in another thread [1], after which we will log the key values
as part of old_tuple_key for toasted tuples even if they are not
changed. We could then add a restriction that, for Updates, the user
can specify only key columns in the WHERE clause, similar to Deletes.
With that, we always have the data required for the filter columns: if
the toasted key values are changed, they will anyway be part of both
the old and the new tuple, and if they are not changed, they will be
part of the old tuple. I have not checked the implementation side of
this, but theoretically it seems possible. If my understanding is
correct, then fixing the other bug [1] becomes a prerequisite for
solving this part of the problem for this patch. The other possibility
is to disallow columns (datatypes) that can lead to toasted data (at
least for Updates), which doesn't sound like a good idea to me. Do you
have any other ideas for this problem?
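
To make the argument a bit more concrete, here is a rough standalone
model of the lookup, written in Python only for illustration. It is not
PostgreSQL code: UNCHANGED_TOAST, check_filter_columns() and
resolve_filter_values() are hypothetical names, and tuples are modeled
as plain dicts. It assumes the fix from [1] is in place, so an unchanged
toasted key value is available in the old key tuple. It only shows that
a value can always be found without detoasting; it does not decide
which tuple the filter should finally be evaluated against.

```python
# Hypothetical marker for an external TOAST value that was not changed
# by the UPDATE and is therefore absent from the logged new tuple.
UNCHANGED_TOAST = object()


def check_filter_columns(filter_columns, key_columns):
    """Reject a row filter that references non-key columns (assumed rule)."""
    extra = sorted(set(filter_columns) - set(key_columns))
    if extra:
        raise ValueError("row filter uses non-key columns: %s" % ", ".join(extra))


def resolve_filter_values(filter_columns, old_key_tuple, new_tuple):
    """Find a value for every filter column without detoasting.

    A changed key value is present in the new tuple. An unchanged toasted
    key value is taken from the old key tuple, which is assumed to be
    logged once the bug in [1] is fixed.
    """
    values = {}
    for col in filter_columns:
        new_val = new_tuple.get(col, UNCHANGED_TOAST)
        if new_val is not UNCHANGED_TOAST:
            values[col] = new_val
        elif old_key_tuple is not None and col in old_key_tuple:
            values[col] = old_key_tuple[col]
        else:
            # Without the fix in [1] we could end up here, and the only
            # way out would be detoasting, which is unsafe in walsender.
            raise LookupError("no logged value for column %s" % col)
    return values
```

For example, with key column "id" unchanged and toasted, the value comes
from the old key tuple: resolve_filter_values(["id"], {"id": "k1"},
{"id": UNCHANGED_TOAST, "payload": "x"}) gives {"id": "k1"}.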

[1] - https://www.postgresql.org/message-id/OS0PR01MB611342D0A92D4F4BF26C0F47FB229%40OS0PR01MB6113.jpnprd01.prod.outlook.com

--
With Regards,
Amit Kapila.
