Re: Data is copied twice when specifying both child and parent table in publication

From: Jacob Champion <jchampion(at)timescale(dot)com>
To: "wangw(dot)fnst(at)fujitsu(dot)com" <wangw(dot)fnst(at)fujitsu(dot)com>
Cc: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, vignesh C <vignesh21(at)gmail(dot)com>, Peter Smith <smithpb2250(at)gmail(dot)com>, "Takamichi Osumi (Fujitsu)" <osumi(dot)takamichi(at)fujitsu(dot)com>, "shiy(dot)fnst(at)fujitsu(dot)com" <shiy(dot)fnst(at)fujitsu(dot)com>, "houzj(dot)fnst(at)fujitsu(dot)com" <houzj(dot)fnst(at)fujitsu(dot)com>, Amit Langote <amitlangote09(at)gmail(dot)com>, Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com>, "pgsql-hackers(at)lists(dot)postgresql(dot)org" <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>, Greg Nancarrow <gregn4422(at)gmail(dot)com>
Subject: Re: Data is copied twice when specifying both child and parent table in publication
Date: 2023-03-17 23:36:53
Message-ID: eb7047d0-6db1-b526-f44a-fad58eda35d4@timescale.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, Mar 16, 2023 at 11:28 PM wangw(dot)fnst(at)fujitsu(dot)com
<wangw(dot)fnst(at)fujitsu(dot)com> wrote:
> Attach the new patch set.

Hi,

I ran into this problem while hacking on [1], so thank you for tackling
it! I have no strong opinions on the implementation itself; I just want
to register a concern that the tests have not kept up with the
implementation complexity.

For example, the corner case mentioned in 0003, with multiple
publications having conflicting pubviaroot settings, isn't tested as far
as I can see. (I checked manually, and it appears to work as intended.)
And the related pub_lower_level test currently only covers the case
where multiple publications have pubviaroot=true, so the following test
comment is now misleading:

> # for tab4, we publish changes through the "middle" partitioned table
> $node_publisher->safe_psql('postgres',
> "CREATE PUBLICATION pub_lower_level FOR TABLE tab4_1 WITH (publish_via_partition_root = true)"
> );

...since the changes are now in fact published via the tab4 root after
this patchset is applied.

> I think the aim of joining it with pg_publication before is to exclude
> non-existing publications. Otherwise, we would get an error because of the call
> to function GetPublicationByName (with 'missing_ok = false') in function
> pg_get_publication_tables.

In the same vein, I don't think that case is covered anywhere.

There are a bunch of moving parts and hidden subtleties here, and I fell
into a few traps when I was working on my patch, so it'd be nice to have
additional coverage. I'm happy to contribute effort in that area if it's
helpful.

Thanks!
--Jacob

[1] https://postgr.es/m/dc57f088-039b-7a71-8f4c-082ef106246e%40timescale.com

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Peter Geoghegan 2023-03-17 23:36:58 Re: Add pg_walinspect function with block info columns
Previous Message Magnus Hagander 2023-03-17 23:17:46 Re: Should we remove vacuum_defer_cleanup_age?