RE: Data is copied twice when specifying both child and parent table in publication

From: "shiy(dot)fnst(at)fujitsu(dot)com" <shiy(dot)fnst(at)fujitsu(dot)com>
To: "houzj(dot)fnst(at)fujitsu(dot)com" <houzj(dot)fnst(at)fujitsu(dot)com>, "wangw(dot)fnst(at)fujitsu(dot)com" <wangw(dot)fnst(at)fujitsu(dot)com>
Cc: Amit Langote <amitlangote09(at)gmail(dot)com>, Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com>, "pgsql-hackers(at)lists(dot)postgresql(dot)org" <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>, Greg Nancarrow <gregn4422(at)gmail(dot)com>, vignesh C <vignesh21(at)gmail(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Subject: RE: Data is copied twice when specifying both child and parent table in publication
Date: 2022-04-19 08:53:09
Message-ID: OSZPR01MB63109BE4F5521C542A75AC9BFDF29@OSZPR01MB6310.jpnprd01.prod.outlook.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, Apr 19, 2022 3:05 PM houzj(dot)fnst(at)fujitsu(dot)com <houzj(dot)fnst(at)fujitsu(dot)com> wrote:
>
> > -----Original Message-----
> > From: Wang, Wei/王 威 <wangw(dot)fnst(at)fujitsu(dot)com>
> On Thursday, April 7, 2022 11:08 AM
> >
> > On Thur, Mar 10, 2021 at 10:08 AM houzj(dot)fnst(at)fujitsu(dot)com wrote:
> > > Hi,
> > >
> > > When reviewing some logical replication related features. I noticed another
> > > possible problem if the subscriber subscribes multiple publications which
> > > publish parent and child table.
> > >
> > > For example:
> > >
> > > ----pub
> > > create table t (a int, b int, c int) partition by range (a);
> > > create table t_1 partition of t for values from (1) to (10);
> > >
> > > create publication pub1 for table t
> > > with (PUBLISH_VIA_PARTITION_ROOT);
> > > create publication pub2 for table t_1
> > > with (PUBLISH_VIA_PARTITION_ROOT);
> > >
> > > ----sub
> > > ---- prepare table t and t_1
> > > CREATE SUBSCRIPTION sub CONNECTION 'port=10000 dbname=postgres'
> > > PUBLICATION pub1, pub2;
> > >
> > > select * from pg_subscription_rel ;
> > > srsubid | srrelid | srsubstate | srsublsn
> > > ---------+---------+------------+-----------
> > > 16391 | 16385(t) | r | 0/150D100
> > > 16391 | 16388(t_1) | r | 0/150D138
> > >
> > > If subscribe two publications one of them publish parent table with
> > > (pubviaroot=true) and another publish child table. Both the parent table and
> > > child table will exist in pg_subscription_rel which also means we will do
> > > initial copy for both tables.
> > >
> > > But after initial copy, we only publish change with the schema of the parent
> > > table(t). It looks a bit inconsistent.
> > >
> > > Based on the document of PUBLISH_VIA_PARTITION_ROOT option. I think
> > the
> > > expected behavior could be we only store the top most parent(table t) in
> > > pg_subscription_rel and do initial copy for it if pubviaroot is on. I haven't
> > > thought about how to fix this and will investigate this later.
> > Hi,
> > I try to fix this bug. Attach the patch.
> >
> > The current HEAD get table list for one publication by invoking function
> > pg_get_publication_tables. If multiple publications are subscribed, then this
> > function is invoked multiple times. So option PUBLISH_VIA_PARTITION_ROOT
> > works
> > independently on every publication, I think it does not work correctly on
> > different publications of the same subscription.
> >
> > So I fix this bug by the following two steps:
> > First step,
> > I get oids of subscribed tables by publication list. Then for tables with the
> > same topmost root table, I filter them base on the option
> > PUBLISH_VIA_PARTITION_ROOT(see new function filter_partitions_oids).
> > After filtering, I get the final oid list.
> > Second step,
> > I get the required informations(nspname and relname) base on the oid list of
> > first step.
>
> Thanks for updating the patch.
> I confirmed that the bug is fixed by this patch.
>
> One suggestion is that can we simplify the code by moving the logic of checking
> the ancestor into the SQL ?. For example, we could filter the outpout of
> pg_publication_tables by adding A WHERE clause which checks whether the table
> is a partition and if its ancestor is also in the output. I think we can also
> filter the needless partition in this approach.
>

I agreed with you and I tried to fix this problem in a simpler way. What we want
is to exclude the partitioned table whose ancestor is also need to be
replicated, so how about implementing that by using the following SQL when
getting the table list from publisher?

SELECT DISTINCT ns.nspname, c.relname
FROM pg_catalog.pg_publication_tables t
JOIN pg_catalog.pg_namespace ns ON ns.nspname = t.schemaname
JOIN pg_catalog.pg_class c ON c.relname = t.tablename AND c.relnamespace = ns.oid
WHERE t.pubname IN ('p0','p2')
AND (c.relispartition IS FALSE OR NOT EXISTS (SELECT 1 FROM pg_partition_ancestors(c.oid)
WHERE relid IN ( SELECT DISTINCT (schemaname||'.'||tablename)::regclass::oid
FROM pg_catalog.pg_publication_tables t
WHERE t.pubname IN ('p0','p2') ) AND relid != c.oid));

Please find the attached patch which used this approach, I also merged the test
in Wang's patch into it.

Regards,
Shi yu

Attachment Content-Type Size
v2-0001-Fix-data-replicated-twice-when-specifying-PUBLISH.patch application/octet-stream 5.6 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Etsuro Fujita 2022-04-19 09:01:33 Re: Defer selection of asynchronous subplans until the executor initialization stage
Previous Message Thomas Munro 2022-04-19 08:31:05 Re: Crash in new pgstats code