Re: Skipping schema changes in publication

From: Shlok Kyal <shlok(dot)kyal(dot)oss(at)gmail(dot)com>
To: shveta malik <shveta(dot)malik(at)gmail(dot)com>
Cc: Peter Smith <smithpb2250(at)gmail(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, "Zhijie Hou (Fujitsu)" <houzj(dot)fnst(at)fujitsu(dot)com>, vignesh C <vignesh21(at)gmail(dot)com>, YeXiu <1518981153(at)qq(dot)com>, Ian Lawrence Barwick <barwick(at)gmail(dot)com>, Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Skipping schema changes in publication
Date: 2025-07-21 19:35:40
Message-ID: CANhcyEXYrzFrGqzw+=qjuC0RFMDaoX+b1SZV4yCJNG6KWxodPQ@mail.gmail.com
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, 21 Jul 2025 at 16:22, shveta malik <shveta(dot)malik(at)gmail(dot)com> wrote:
>
> On Sat, Jul 19, 2025 at 4:17 PM Shlok Kyal <shlok(dot)kyal(dot)oss(at)gmail(dot)com> wrote:
> >
> > On Mon, 30 Jun 2025 at 16:25, shveta malik <shveta(dot)malik(at)gmail(dot)com> wrote:
> > >
> > > Few more comments on 002:
> > >
> > > 5)
> > > +GetAllTablesPublicationRelations(Oid pubid, bool pubviaroot)
> > > {
> > >
> > > + List *exceptlist;
> > > +
> > > + exceptlist = GetPublicationRelations(pubid, PUBLICATION_PART_ALL);
> > >
> > >
> > > a) Here, we are assuming that the list provided by
> > > GetPublicationRelations() will be except-tables list only, but there
> > > is no validation of that.
> > > b) We are using GetPublicationRelations() to get the relations which
> > > are excluded from the publication. The name of function and comments
> > > atop function are not in alignment with this usage.
> > >
> > > Suggestion:
> > > We can have a new GetPublicationExcludeRelations() function for the
> > > concerned usage. The existing logic of GetPublicationRelations() can
> > > be shifted to a new internal-logic function which will accept a
> > > 'except-flag' as well. Both GetPublicationRelations() and
> > > GetPublicationExcludeRelations() can call that new function by passing
> > > 'except-flag' as false and true respectively. The new internal
> > > function will validate 'prexcept' against that except-flag passed and
> > > will return the results.
> > >
> > I have made the above change.
> >
> >
> > > 6)
> > > Before your patch002, GetTopMostAncestorInPublication() was checking
> > > pg_publication_rel and pg_publication_namespace to find out if the
> > > table in the ancestor-list is part of a given particular. Both
> > > pg_publication_rel and pg_publication_namespace did not have the entry
> > > "for all tables" publications. That means
> > > GetTopMostAncestorInPublication() was originally not checking whether
> > > the given puboid is an "for all tables" publication to see if a rel
> > > belongs to that particular pub or not. I
> > >
> > > But now with the current change, we do check if pub is all-tables pub,
> > > if so, return relid and mark ancestor_level (provided table is not
> > > part of the except list). IIUC, the result in 2 cases may be
> > > different. Is that the intention? Let me know if my understanding is
> > > wrong.
> > >
> > This is intentional, in function get_rel_sync_entry, we are setting
> > pub_relid to the topmost published ancestor. In HEAD we are directly
> > setting using:
> > /*
> > * If this is a FOR ALL TABLES publication, pick the partition
> > * root and set the ancestor level accordingly.
> > */
> > if (pub->alltables)
> > {
> > publish = true;
> > if (pub->pubviaroot && am_partition)
> > {
> > List *ancestors = get_partition_ancestors(relid);
> >
> > pub_relid = llast_oid(ancestors);
> > ancestor_level = list_length(ancestors);
> > }
> > }
> > In HEAD, we can directly use 'llast_oid(ancestors)' to get the topmost
> > ancestor for case of FOR ALL TABLES.
> > But with this proposal. This change will no longer be valid as the
> > 'llast_oid(ancestors)' may be excluded in the publication. So, to
> > handle this change was made in GetTopMostAncestorInPublication.
> >
> >
> > Also, during testing with the partitioned table and
> > publish_via_partition_root the behaviour of the current patch is as
> > below:
> > For example we have a partitioned table t1. It has partitions part1
> > and part2. Now consider the following cases:
> > 1. with publish_via_partition_root = true
> > I. If we create publication on all tables with EXCEPT t1, no data
> > for t1, part1 or part2 is replicated.
> > II. If we create publication on all tables with EXCEPT part1,
> > data for all tables t1, part1 and part2 is replicated.
> > 2. with publish_via_partition_root = false
> > I. If we create publication on all tables with EXCEPT t1, no data
> > for t1, part1 or part2 is replicated.
> > II. If we create publication on all tables with EXCEPT part1,
> > data for part1 is not replicated
> >
> > Is this behaviour fine?
> > I checked for other databases such as MySQL, SQL Server. In that we do
> > not have such cases as either we replicate the whole partitioned table
> > or we not replicated at all. We do not have partition level control.
> > For Oracle, I found that we can include or exclude partitions using
> > 'PARTITIONEXCLUDE' [2], but did not find something similar to
> > publish_via_partition_root or where partitions are published as
> > separate tables.
> > What are your thoughts on the above behaviour?
> >
>
> Thank You for the details. I will review this behaviour soon and will
> let you know my comments. Meanwhile, please find a few comments on
> v16-0001:
>
> 1)
> we do LockSchemaList() everywhere before we call
> PublicationDropSchemas() to prevent concurrent schema deletion. Do we
> need that in reset flow as well?
Added

>
> 2)
> + /* Drop the schemas associated with the publication */
> + schemas = GetPublicationSchemas(pubid);
> + PublicationDropSchemas(pubid, schemas, true);
> +
> + /* Get all relations associated with the publication */
> + relids = GetPublicationRelations(pubid, PUBLICATION_PART_ROOT);
>
> We can rename schemas to schemaids similar to relids, as
> GetPublicationSchemas return oids.
>
Fixed

> 3)
> + /* Drop the relations associated with the publication */
> + PublicationDropTables(pubform->oid, rels, true);
>
> we can pass 'pubid' here instead of pubform->oid
>
Modified

> 4)
> Shall we modify the comments:
> 'Drop the relations associated with the publication' to 'Remove the
> associated relations from the publication'
> 'Drop the schemas associated with the publication' to 'Remove the
> associated schemas from the publication'
>
> Similar changes can be done in test file's comments as well
> --Verify that tables associated with the publication are dropped after
> RESET
> --Verify that schemas associated with the publication are dropped after RESET
>
Fixed

I have made the changes in the latest v17 patch [1].
[1]: https://www.postgresql.org/message-id/CANhcyEUtYV-9ujtxLasnxN_peT%2B3LuZjcRx1xUECh1CCmANB8w%40mail.gmail.com

Thanks,
Shlok Kyal

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Masahiko Sawada 2025-07-21 19:41:49 Re: Parallel heap vacuum
Previous Message Shlok Kyal 2025-07-21 19:33:37 Re: Skipping schema changes in publication