Re: row filtering for logical replication

From: Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: "houzj(dot)fnst(at)fujitsu(dot)com" <houzj(dot)fnst(at)fujitsu(dot)com>, Peter Smith <smithpb2250(at)gmail(dot)com>, "tanghy(dot)fnst(at)fujitsu(dot)com" <tanghy(dot)fnst(at)fujitsu(dot)com>, Euler Taveira <euler(at)eulerto(dot)com>, Greg Nancarrow <gregn4422(at)gmail(dot)com>, vignesh C <vignesh21(at)gmail(dot)com>, Ajin Cherian <itsajin(at)gmail(dot)com>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>, Rahila Syed <rahilasyed90(at)gmail(dot)com>, Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com>, Önder Kalacı <onderkalaci(at)gmail(dot)com>, japin <japinli(at)hotmail(dot)com>, Michael Paquier <michael(at)paquier(dot)xyz>, David Steele <david(at)pgmasters(dot)net>, Craig Ringer <craig(at)2ndquadrant(dot)com>, Amit Langote <amitlangote09(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: row filtering for logical replication
Date: 2022-01-20 14:26:13
Message-ID: 202201201426.d5rduokb7ufi@alvherre.pgsql
Lists: pgsql-hackers

On 2022-Jan-20, Amit Kapila wrote:

> It returns an invalid column referenced in an RF if any but if not
> then it helps to form pubactions which is anyway required at a later
> point in the caller. The idea is that when we are already traversing
> publications we should store/gather as much info as possible.

I think this design isn't quite awesome.

> I think probably the API name is misleading, maybe we should name it
> something like ValidateAndFetchPubInfo, ValidateAndRememberPubInfo, or
> something along these lines?

Maybe RelationBuildReplicationPublicationDesc or just
RelationBuildPublicationDesc are good names for a routine that fills in
the publication aspect of the relcache entry, as a parallel to
RelationBuildPartitionDesc.

> > Maybe this was meant to be "validate RF
> > expressions" and return, perhaps, a bitmapset of all invalid columns
> > referenced?
>
> Currently, we stop as soon as we find the first invalid column.

That seems quite strange. (And above you say "gather as much info as
possible", so why stop at the first one?)
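
To make the contrast concrete, here is a minimal sketch of the two
behaviours: stopping at the first invalid column versus collecting all of
them. It is a language-neutral model in Python, not code from the patch;
the function names are hypothetical, and a plain set stands in for the
bitmapset mentioned above.

```python
from typing import Iterable, Optional, Set


def first_invalid_column(filter_columns: Iterable[str],
                         replica_identity: Set[str]) -> Optional[str]:
    """Return the first row-filter column that is not in the replica
    identity, or None if every column is covered (stop-at-first)."""
    for col in filter_columns:
        if col not in replica_identity:
            return col
    return None


def all_invalid_columns(filter_columns: Iterable[str],
                        replica_identity: Set[str]) -> Set[str]:
    """Return every row-filter column that is not in the replica
    identity (the 'gather as much info as possible' variant)."""
    return {col for col in filter_columns if col not in replica_identity}
```

With filter columns (id, a, b) and a replica identity of just (id), the
first function reports only "a", while the second reports both "a" and "b".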

> > (What is an invalid column in the first place?)
>
> A column that is referenced in the row filter but is not part of
> Replica Identity.

I do wonder how these invalid columns reach the table definition in
the first place. Shouldn't these be detected at DDL time and prohibited
from getting into the definition?

... so if I do
    ADD TABLE foobar WHERE (col_not_in_replident = 42)
then I should get an error immediately, rather than be forced to
construct a relcache entry with "invalid" data in it. Likewise if I
change the replica identity to one that causes one of these to be
invalid. Isn't this the same approach we discussed for column
filtering?
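
As a rough illustration of checking at DDL time, the sketch below models
both entry points: adding a table with a row filter, and changing the
replica identity afterwards. It is a toy model in Python under stated
assumptions, not the patch or any PostgreSQL API; the Table class and all
function names are hypothetical.

```python
class Table:
    """Toy stand-in for a table definition (hypothetical, not a relcache entry)."""

    def __init__(self, name, replica_identity):
        self.name = name
        self.replica_identity = set(replica_identity)
        # Columns referenced by the publication's row filter, if any.
        self.row_filter_columns = set()


def check_filter_columns(table_name, filter_columns, replica_identity):
    """Raise if any row-filter column is outside the replica identity."""
    bad = sorted(set(filter_columns) - set(replica_identity))
    if bad:
        raise ValueError(
            "row filter on %s references columns not in replica identity: %s"
            % (table_name, ", ".join(bad)))


def add_table_with_filter(table, filter_columns):
    """Model of ADD TABLE ... WHERE: validate first, store only if valid."""
    check_filter_columns(table.name, filter_columns, table.replica_identity)
    table.row_filter_columns = set(filter_columns)


def set_replica_identity(table, new_identity):
    """Model of changing the replica identity: reject the change if it
    would invalidate an already stored row filter."""
    check_filter_columns(table.name, table.row_filter_columns, new_identity)
    table.replica_identity = set(new_identity)
```

In this model an invalid combination is rejected by whichever command
would create it, so nothing "invalid" is ever stored for a later reader to
discover.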

--
Álvaro Herrera Valdivia, Chile — https://www.EnterpriseDB.com/
I'm going to wipe out all humans / the humans, I will wipe them out
I'm going to wipe out all of them (repeat) / all the humans I will wipe out, I will! (Bender)
