Re: row filtering for logical replication

From: Peter Smith <smithpb2250(at)gmail(dot)com>
To: "tanghy(dot)fnst(at)fujitsu(dot)com" <tanghy(dot)fnst(at)fujitsu(dot)com>
Cc: Ajin Cherian <itsajin(at)gmail(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, "houzj(dot)fnst(at)fujitsu(dot)com" <houzj(dot)fnst(at)fujitsu(dot)com>, vignesh C <vignesh21(at)gmail(dot)com>, Greg Nancarrow <gregn4422(at)gmail(dot)com>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>, Euler Taveira <euler(at)eulerto(dot)com>, Rahila Syed <rahilasyed90(at)gmail(dot)com>, Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com>, Önder Kalacı <onderkalaci(at)gmail(dot)com>, japin <japinli(at)hotmail(dot)com>, Michael Paquier <michael(at)paquier(dot)xyz>, David Steele <david(at)pgmasters(dot)net>, Craig Ringer <craig(at)2ndquadrant(dot)com>, Amit Langote <amitlangote09(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: row filtering for logical replication
Date: 2021-12-13 23:13:52
Message-ID: CAHut+PvtA-QFwwS3yahMrYyKqtjr4ZbcALFbuq4p42fA7GVOaA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, Dec 7, 2021 at 5:48 PM tanghy(dot)fnst(at)fujitsu(dot)com
<tanghy(dot)fnst(at)fujitsu(dot)com> wrote:
>
...
> Thanks for looking into it.
>
> I have another problem with your patch. The document says:
>
> ... If the subscription has several publications in
> + which the same table has been published with different filters, those
> + expressions get OR'ed together so that rows satisfying any of the expressions
> + will be replicated. Notice this means if one of the publications has no filter
> + at all then all other filters become redundant.
>
> Then, what if one of the publications is specified as 'FOR ALL TABLES' or 'FOR
> ALL TABLES IN SCHEMA'.
>
> For example:
> create table tbl (a int primary key);"
> create publication p1 for table tbl where (a > 10);
> create publication p2 for all tables;
> create subscription sub connection 'dbname=postgres port=5432' publication p1, p2;
>
> I think for "FOR ALL TABLE" publication(p2 in my case), table tbl should be
> treated as no filter, and table tbl should have no filter in subscription sub. Thoughts?
>
> But for now, the filter(a > 10) works both when copying initial data and later changes.
>
> To fix it, I think we can check if the table is published in a 'FOR ALL TABLES'
> publication or published as part of schema in function pgoutput_row_filter_init
> (which was introduced in v44-0003 patch), also we need to make some changes in
> tablesync.c.
>

Partly fixed in v46-0005 [1]

NOTE
- The initial COPY part of the tablesync does not take the publish
operation into account so it means that if any of the subscribed
publications have "puballtables" flag then all data will be copied
sans filters. I guess this is consistent with the other decision to
ignore publication operations [2].

TODO
- Documentation
- IIUC there is a similar case yet to be addressed - FOR ALL TABLES IN SCHEMA

------
[1] https://www.postgresql.org/message-id/CAHut%2BPtoxjo6hpDFTya6WYH-zdspKQ5j%2BwZHBRc6EZkAkq7Nfw%40mail.gmail.com
[2] https://www.postgresql.org/message-id/CAA4eK1L3r%2BURSLFotOT5Y88ffscCskRoGC15H3CSAU1jj_0Rdg%40mail.gmail.com

Kind Regards,
Peter Smith.
Fujitsu Australia

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2021-12-13 23:14:52 Re: Adding CI to our tree
Previous Message Bossart, Nathan 2021-12-13 23:05:46 Re: O(n) tasks cause lengthy startups and checkpoints