RE: Data is copied twice when specifying both child and parent table in publication

From: "wangw(dot)fnst(at)fujitsu(dot)com" <wangw(dot)fnst(at)fujitsu(dot)com>
To: vignesh C <vignesh21(at)gmail(dot)com>
Cc: Peter Smith <smithpb2250(at)gmail(dot)com>, "Takamichi Osumi (Fujitsu)" <osumi(dot)takamichi(at)fujitsu(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, "shiy(dot)fnst(at)fujitsu(dot)com" <shiy(dot)fnst(at)fujitsu(dot)com>, "houzj(dot)fnst(at)fujitsu(dot)com" <houzj(dot)fnst(at)fujitsu(dot)com>, Amit Langote <amitlangote09(at)gmail(dot)com>, Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com>, "pgsql-hackers(at)lists(dot)postgresql(dot)org" <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>, Greg Nancarrow <gregn4422(at)gmail(dot)com>
Subject: RE: Data is copied twice when specifying both child and parent table in publication
Date: 2022-11-16 08:58:31
Message-ID: OS3PR01MB6275C62FF1D1CDE4FCBF61339E079@OS3PR01MB6275.jpnprd01.prod.outlook.com
Lists: pgsql-hackers

On Mon, Nov 14, 2022 at 0:56 AM vignesh C <vignesh21(at)gmail(dot)com> wrote:
> >
> > Attach new patches.
>

Thanks for your comments.

> Here we are having tables list to store the relids and table_infos
> list which stores pubid along with relid. Here tables list acts as a
> temporary list to get filter_partitions and then delete the
> published_rel from table_infos. Will it be possible to directly
> operate on table_infos list and remove the temporary tables list used.
> We might have to implement comparator, deduplication functions and
> change filter_partitions function to work directly on published_rel
> type list.
> +       /*
> +        * Record the published table and the corresponding publication so
> +        * that we can get row filters and column list later.
> +        *
> +        * When a table is published by multiple publications, to obtain
> +        * all row filters and column list, the structure related to this
> +        * table will be recorded multiple times.
> +        */
> +       foreach(lc, pub_elem_tables)
> +       {
> +               published_rel *table_info = (published_rel *) malloc(sizeof(published_rel));
> +
> +               table_info->relid = lfirst_oid(lc);
> +               table_info->pubid = pub_elem->oid;
> +               table_infos = lappend(table_infos, table_info);
> +       }
> +
> +       tables = list_concat(tables, pub_elem_tables);
>
> Thoughts?

I think we can only deduplicate published tables within each publication,
because we later need all row filters and column lists for each published table.
I removed the temporary list 'tables' and changed the API of the function
filter_partitions so that it works on a list of published_rel entries.
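
As a rough illustration of the per-publication deduplication described above
(this is not the code in the attached patches), the sketch below removes
duplicate (relid, pubid) pairs while keeping one entry per publication for a
table that is published by several publications. It uses a plain C array and
qsort() instead of a PostgreSQL List, and the names published_rel_cmp and
dedup_published_rels are hypothetical; only the published_rel fields relid and
pubid come from the quoted hunk.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Stand-in for PostgreSQL's Oid type, so the sketch builds on its own. */
typedef unsigned int Oid;

/* Same two fields as the published_rel struct in the quoted hunk. */
typedef struct published_rel
{
    Oid relid;   /* published table */
    Oid pubid;   /* publication that publishes it */
} published_rel;

/*
 * Hypothetical comparator: order by pubid, then by relid, so that
 * duplicates within one publication end up next to each other.
 */
static int
published_rel_cmp(const void *a, const void *b)
{
    const published_rel *ra = (const published_rel *) a;
    const published_rel *rb = (const published_rel *) b;

    if (ra->pubid != rb->pubid)
        return (ra->pubid < rb->pubid) ? -1 : 1;
    if (ra->relid != rb->relid)
        return (ra->relid < rb->relid) ? -1 : 1;
    return 0;
}

/*
 * Hypothetical helper: sort the array, then compact it in place so that
 * each (relid, pubid) pair appears once. The same relid under a different
 * pubid is kept, since its row filter and column list may differ.
 * Returns the new number of entries.
 */
static size_t
dedup_published_rels(published_rel *rels, size_t n)
{
    size_t out = 0;

    assert(rels != NULL || n == 0);
    if (n == 0)
        return 0;

    qsort(rels, n, sizeof(published_rel), published_rel_cmp);

    for (size_t i = 1; i < n; i++)
    {
        if (published_rel_cmp(&rels[out], &rels[i]) != 0)
            rels[++out] = rels[i];
    }
    return out + 1;
}
```

With the input pairs (16384, 1), (16385, 1), (16384, 2), (16384, 1), only the
repeated (16384, 1) is dropped; table 16384 stays listed under both
publications.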

Attached is the new patch set.

Regards,
Wang wei

Attachment                                                               Content-Type              Size
HEAD_v15-0001-Fix-data-replicated-twice-when-specifying-publis.patch     application/octet-stream  22.5 KB
HEAD_v15-0002-Add-clarification-for-the-behaviour-of-the-publi.patch     application/octet-stream  2.4 KB
REL14_v15-0001-Fix-data-replicated-twice-when-specifying-publis_patch    application/octet-stream  5.3 KB
REL15_v15-0001-Fix-data-replicated-twice-when-specifying-publis_patch    application/octet-stream  8.6 KB
