RE: Data is copied twice when specifying both child and parent table in publication

From: "houzj(dot)fnst(at)fujitsu(dot)com" <houzj(dot)fnst(at)fujitsu(dot)com>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: Amit Langote <amitlangote09(at)gmail(dot)com>, Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com>, "pgsql-hackers(at)lists(dot)postgresql(dot)org" <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>, Greg Nancarrow <gregn4422(at)gmail(dot)com>, vignesh C <vignesh21(at)gmail(dot)com>
Subject: RE: Data is copied twice when specifying both child and parent table in publication
Date: 2022-03-10 02:17:32
Message-ID: OS0PR01MB5716DC2982CC735FDE388804940B9@OS0PR01MB5716.jpnprd01.prod.outlook.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

When reviewing some logical replication related features. I noticed another
possible problem if the subscriber subscribes multiple publications which
publish parent and child table.

For example:

----pub
create table t (a int, b int, c int) partition by range (a);
create table t_1 partition of t for values from (1) to (10);

create publication pub1 for table t
with (PUBLISH_VIA_PARTITION_ROOT);
create publication pub2 for table t_1
with (PUBLISH_VIA_PARTITION_ROOT);

----sub
---- prepare table t and t_1
CREATE SUBSCRIPTION sub CONNECTION 'port=10000 dbname=postgres' PUBLICATION pub1, pub2;

select * from pg_subscription_rel ;
srsubid | srrelid | srsubstate | srsublsn
---------+---------+------------+-----------
16391 | 16385(t) | r | 0/150D100
16391 | 16388(t_1) | r | 0/150D138

If subscribe two publications one of them publish parent table with
(pubviaroot=true) and another publish child table. Both the parent table and
child table will exist in pg_subscription_rel which also means we will do
initial copy for both tables.

But after initial copy, we only publish change with the schema of the parent
table(t). It looks a bit inconsistent.

Based on the document of PUBLISH_VIA_PARTITION_ROOT option. I think the
expected behavior could be we only store the top most parent(table t) in
pg_subscription_rel and do initial copy for it if pubviaroot is on. I haven't
thought about how to fix this and will investigate this later.

Best regards,
Hou zj

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Thomas Munro 2022-03-10 02:43:16 Re: Adding CI to our tree
Previous Message Imseih (AWS), Sami 2022-03-10 01:37:58 Re: Add index scan progress to pg_stat_progress_vacuum