Re: Data is copied twice when specifying both child and parent table in publication

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: "wangw(dot)fnst(at)fujitsu(dot)com" <wangw(dot)fnst(at)fujitsu(dot)com>
Cc: Andres Freund <andres(at)anarazel(dot)de>, vignesh C <vignesh21(at)gmail(dot)com>, Peter Smith <smithpb2250(at)gmail(dot)com>, "Takamichi Osumi (Fujitsu)" <osumi(dot)takamichi(at)fujitsu(dot)com>, "shiy(dot)fnst(at)fujitsu(dot)com" <shiy(dot)fnst(at)fujitsu(dot)com>, "houzj(dot)fnst(at)fujitsu(dot)com" <houzj(dot)fnst(at)fujitsu(dot)com>, Amit Langote <amitlangote09(at)gmail(dot)com>, Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com>, "pgsql-hackers(at)lists(dot)postgresql(dot)org" <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>, Greg Nancarrow <gregn4422(at)gmail(dot)com>
Subject: Re: Data is copied twice when specifying both child and parent table in publication
Date: 2023-03-16 12:25:22
Message-ID: CAA4eK1LCzvPLVHHXn+VcGmFHApKtbpaNF1UFed3qE8=GhPzPGw@mail.gmail.com
Lists: pgsql-hackers

On Wed, Feb 8, 2023 at 9:21 AM wangw(dot)fnst(at)fujitsu(dot)com
<wangw(dot)fnst(at)fujitsu(dot)com> wrote:
>
> I think this failure is caused by the recent commit (b7ae039) in the current
> HEAD. I have rebased the patch set and attached it.
>

+ if (server_version >= 160000)
+ {
+ appendStringInfo(&cmd, "SELECT DISTINCT N.nspname, C.relname,\n"
+ " ( SELECT array_agg(a.attname ORDER BY a.attnum)\n"
+ " FROM pg_attribute a\n"
+ " WHERE a.attrelid = GPT.relid AND a.attnum > 0 AND\n"
+ " NOT a.attisdropped AND\n"
+ " (a.attnum = ANY(GPT.attrs) OR GPT.attrs IS NULL)\n"
+ " ) AS attnames\n"
+ " FROM pg_class C\n"
+ " JOIN pg_namespace N ON N.oid = C.relnamespace\n"
+ " JOIN ( SELECT (pg_get_publication_tables(VARIADIC array_agg(pubname::text))).*\n"
+ " FROM pg_publication\n"
+ " WHERE pubname IN ( %s )) as GPT\n"
+ " ON GPT.relid = C.oid\n",
+ pub_names.data);

The function pg_get_publication_tables() already handles dropped
columns, so we don't need to check for them again in this query. Also,
the part that builds attnames should be the same as it is in the view
pg_publication_tables. Can we pass the list of pubnames directly to
pg_get_publication_tables() instead of joining it with
pg_publication?
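
To make the suggestion concrete, here is a rough, untested sketch of how
the >= 16 query text might be built. It is only an illustration, not the
patch: build_pub_tables_query() is a hypothetical standalone helper that
uses snprintf() in place of appendStringInfo(), and it assumes that
pub_names is the already-quoted, comma-separated list the existing code
produces, that the function accepts VARIADIC text[] as in the proposed
patch, and that attrs is never NULL in its output.

```c
#include <stdio.h>
#include <stddef.h>

/*
 * Hypothetical helper (not part of the patch): writes the sketched query
 * into buf. pub_names is assumed to be a quoted, comma-separated list of
 * publication names such as 'pub1','pub2'.
 *
 * Returns the query length, or -1 if buf is too small.
 */
int
build_pub_tables_query(char *buf, size_t buflen, const char *pub_names)
{
    int n = snprintf(buf, buflen,
        "SELECT DISTINCT N.nspname, C.relname,\n"
        /* attnames built the same way as in the pg_publication_tables view */
        "  ( SELECT array_agg(a.attname ORDER BY a.attnum)\n"
        "    FROM pg_attribute a\n"
        "    WHERE a.attrelid = GPT.relid AND\n"
        "          a.attnum = ANY(GPT.attrs)\n"
        "  ) AS attnames\n"
        " FROM pg_class C\n"
        "   JOIN pg_namespace N ON N.oid = C.relnamespace\n"
        /* publication names are passed directly; no join with the catalog */
        "   JOIN pg_get_publication_tables(VARIADIC ARRAY[%s]) AS GPT\n"
        "     ON GPT.relid = C.oid\n",
        pub_names);

    /* snprintf reports the length it wanted; treat truncation as failure */
    if (n < 0 || (size_t) n >= buflen)
        return -1;
    return n;
}
```

Calling it with the list 'pub1','pub2' would produce a query that invokes
pg_get_publication_tables(VARIADIC ARRAY['pub1','pub2']) once, with no
dropped-column filter of its own in the attnames subquery.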

Can we keep the else part (the fix for publisher < 16) the same as in
HEAD and move the proposed change to a separate patch? Basically, for
the HEAD patch, let's just try to fix this for publisher >= 16. This
is a corner-case bug and we haven't seen any user complaints about it,
so I am slightly worried that a complex fix for the back branches may
not be required; at the least, we can discuss that separately.

--
With Regards,
Amit Kapila.
