Re: row filtering for logical replication

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com>
Cc: "tanghy(dot)fnst(at)fujitsu(dot)com" <tanghy(dot)fnst(at)fujitsu(dot)com>, Peter Smith <smithpb2250(at)gmail(dot)com>, Ajin Cherian <itsajin(at)gmail(dot)com>, "houzj(dot)fnst(at)fujitsu(dot)com" <houzj(dot)fnst(at)fujitsu(dot)com>, vignesh C <vignesh21(at)gmail(dot)com>, Greg Nancarrow <gregn4422(at)gmail(dot)com>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>, Euler Taveira <euler(at)eulerto(dot)com>, Rahila Syed <rahilasyed90(at)gmail(dot)com>, Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com>, Önder Kalacı <onderkalaci(at)gmail(dot)com>, japin <japinli(at)hotmail(dot)com>, Michael Paquier <michael(at)paquier(dot)xyz>, David Steele <david(at)pgmasters(dot)net>, Craig Ringer <craig(at)2ndquadrant(dot)com>, Amit Langote <amitlangote09(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: row filtering for logical replication
Date: 2021-12-08 05:24:47
Message-ID: CAA4eK1+3XbBjAgtVTgE4ky+MGaqAJrku=JVeqajKh5BfrdcJTw@mail.gmail.com
Lists: pgsql-hackers

On Tue, Dec 7, 2021 at 6:31 PM Ashutosh Bapat
<ashutosh(dot)bapat(dot)oss(at)gmail(dot)com> wrote:
>
> On Tue, Dec 7, 2021 at 12:18 PM tanghy(dot)fnst(at)fujitsu(dot)com
> <tanghy(dot)fnst(at)fujitsu(dot)com> wrote:
> >
> > I have another problem with your patch. The document says:
> >
> > ... If the subscription has several publications in
> > + which the same table has been published with different filters, those
> > + expressions get OR'ed together so that rows satisfying any of the expressions
> > + will be replicated. Notice this means if one of the publications has no filter
> > + at all then all other filters become redundant.
> >
> > Then, what if one of the publications is specified as 'FOR ALL TABLES' or 'FOR
> > ALL TABLES IN SCHEMA'?
> >
> > For example:
> > create table tbl (a int primary key);
> > create publication p1 for table tbl where (a > 10);
> > create publication p2 for all tables;
> > create subscription sub connection 'dbname=postgres port=5432' publication p1, p2;
>
> Thanks for the example. I was wondering about this case myself.
>

I think we should handle this case.

> >
> > I think for a "FOR ALL TABLES" publication (p2 in my case), table tbl should be
> > treated as having no filter, so tbl should have no filter in subscription sub. Thoughts?
> >
> > But for now, the filter (a > 10) is applied both when copying initial data and to later changes.
> >
> > To fix it, I think we can check whether the table is published in a 'FOR ALL TABLES'
> > publication or published as part of a schema in the function pgoutput_row_filter_init
> > (which was introduced in the v44-0003 patch). We also need to make some changes in
> > tablesync.c.
>
> In order to check "FOR ALL TABLES", we might need to fetch publication
> metadata.
>

Do we really need to perform a separate fetch for this? In
get_rel_sync_entry(), we already have this information. Can't we
somehow stash it in the corresponding RelationSyncEntry so that it
can be used later for row filtering?
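
To illustrate the idea, here is a minimal standalone sketch in C. It is
not the pgoutput code: the Sketch* types, their fields, and the toy
"a > N" filter are all invented for illustration. It only shows how a
per-relation cache entry could record, while the publications are being
walked anyway, that some publication publishes the table without a
filter, so that row filtering can be skipped later without a second
lookup.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_MAX_FILTERS 8

/* Hypothetical, simplified view of one publication for a given table. */
typedef struct SketchPublication
{
    bool alltables;   /* FOR ALL TABLES (or schema-level) publication */
    bool has_filter;  /* table was listed with a WHERE clause */
    int  min_a;       /* toy filter: a row passes if a > min_a */
} SketchPublication;

/* Hypothetical cache entry, loosely modelled on RelationSyncEntry. */
typedef struct SketchSyncEntry
{
    bool no_filter;                    /* some publication is unfiltered */
    int  nfilters;                     /* number of stored toy filters */
    int  filters[SKETCH_MAX_FILTERS];  /* min_a value of each filter */
} SketchSyncEntry;

/*
 * Walk the publications once and stash the result in the entry.
 * Returns false if there are more filters than the sketch can hold.
 */
static bool
sketch_build_entry(SketchSyncEntry *entry,
                   const SketchPublication *pubs, size_t npubs)
{
    assert(entry != NULL && (pubs != NULL || npubs == 0));

    entry->no_filter = false;
    entry->nfilters = 0;

    for (size_t i = 0; i < npubs; i++)
    {
        /* An unfiltered publication makes every other filter redundant. */
        if (pubs[i].alltables || !pubs[i].has_filter)
        {
            entry->no_filter = true;
            entry->nfilters = 0;
            return true;
        }

        if (entry->nfilters >= SKETCH_MAX_FILTERS)
            return false;
        entry->filters[entry->nfilters++] = pubs[i].min_a;
    }
    return true;
}

/* Filters are OR'ed: a row is sent if any one of them accepts it. */
static bool
sketch_row_passes(const SketchSyncEntry *entry, int a)
{
    if (entry->no_filter)
        return true;

    for (int i = 0; i < entry->nfilters; i++)
    {
        if (a > entry->filters[i])
            return true;
    }
    return false;
}
```

With p1 (a > 10) and p2 (all tables) from the example above, the entry
ends up with no_filter set, so a row with a = 5 would be sent. With p1
alone, the same row would be filtered out.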

> Instead of that, can we add a "TRUE" filter on all the tables
> which are part of a FOR ALL TABLES publication?
>

How? We won't have an entry for such tables in pg_publication_rel
where we store row_filter information.

--
With Regards,
Amit Kapila.
