Re: Data is copied twice when specifying both child and parent table in publication

From: Dilip Kumar <dilipbalaut(at)gmail(dot)com>
To: "houzj(dot)fnst(at)fujitsu(dot)com" <houzj(dot)fnst(at)fujitsu(dot)com>
Cc: Amit Langote <amitlangote09(at)gmail(dot)com>, Greg Nancarrow <gregn4422(at)gmail(dot)com>, Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com>, "pgsql-hackers(at)lists(dot)postgresql(dot)org" <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Subject: Re: Data is copied twice when specifying both child and parent table in publication
Date: 2021-10-19 05:37:46
Message-ID: CAFiTN-uG8R0vEuzhOgco2u=2dkZ+dr9ReR8yTnmJWtaQ_NU4Bg@mail.gmail.com
Lists: pgsql-hackers

On Tue, Oct 19, 2021 at 8:17 AM houzj(dot)fnst(at)fujitsu(dot)com
<houzj(dot)fnst(at)fujitsu(dot)com> wrote:

> Thanks for the explanation.
>
> I think one reason that I consider this behavior a bug is that: If we add
> both the root partitioned table and the leaf partition explicitly to the
> publication (and set publish_via_partition_root = on), the behavior of the
> apply worker is inconsistent with the behavior of table sync worker.
>
> In this case, all changes in the leaf partition will be applied using the
> identity and schema of the partitioned (root) table. But for the table sync, it
> will execute table sync for both the leaf and the root table, which causes
> duplication of data.
>
> Wouldn't it be better to make the behavior consistent here?

I agree with the point: whether we are doing the initial sync or
streaming transactions, the behavior should be the same. I think the
right behavior is that even if the user has listed both the parent
table and the child table in the publication, the data should be
synced only once. Consider the case where the same table is added
twice, e.g. (CREATE PUBLICATION mypub FOR TABLE t1,t1;). In that case
too we consider the table only once, and there is no duplicate data.
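
To make the proposed behavior concrete, here is a minimal sketch in
Python. It is only a toy model and not PostgreSQL code: the function
name tables_to_sync and the parent_of mapping are hypothetical, and it
assumes publish_via_partition_root = on. It drops exact duplicates (the
t1,t1 case) and skips any partition whose ancestor is also listed, so
each row would be copied only once during the initial sync.

```python
from typing import Dict, List


def tables_to_sync(published: List[str], parent_of: Dict[str, str]) -> List[str]:
    """Toy model: pick the tables that need an initial sync.

    published -- table names as listed in the publication (may repeat)
    parent_of -- hypothetical map from a partition to its direct parent
    Assumes publish_via_partition_root = on.
    """
    # Drop exact duplicates while keeping the original order.
    unique = list(dict.fromkeys(published))
    listed = set(unique)

    result = []
    for table in unique:
        # Walk up the partition tree looking for a listed ancestor.
        ancestor = parent_of.get(table)
        covered = False
        while ancestor is not None:
            if ancestor in listed:
                covered = True
                break
            ancestor = parent_of.get(ancestor)
        # A partition covered by a listed ancestor is synced via that
        # ancestor, so it must not be synced a second time on its own.
        if not covered:
            result.append(table)
    return result
```

With parent_of = {"child": "parent"}, calling
tables_to_sync(["parent", "child"], parent_of) yields only the parent,
which matches how the apply worker already treats changes to the leaf
in this configuration.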

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com
