| From: | Shlok Kyal <shlok(dot)kyal(dot)oss(at)gmail(dot)com> | 
|---|---|
| To: | Peter Smith <smithpb2250(at)gmail(dot)com> | 
| Cc: | Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, "Zhijie Hou (Fujitsu)" <houzj(dot)fnst(at)fujitsu(dot)com>, vignesh C <vignesh21(at)gmail(dot)com>, YeXiu <1518981153(at)qq(dot)com>, Ian Lawrence Barwick <barwick(at)gmail(dot)com>, Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org> | 
| Subject: | Re: Skipping schema changes in publication | 
| Date: | 2025-06-24 04:17:53 | 
| Message-ID: | CANhcyEXX3viVpYcGHD_fzhf_f6CDQWr2+VBywrJf5zm_XiB4tg@mail.gmail.com | 
| Views: | Whole Thread | Raw Message | Download mbox | Resend email | 
| Thread: | |
| Lists: | pgsql-hackers | 
On Fri, 20 Jun 2025 at 09:28, Peter Smith <smithpb2250(at)gmail(dot)com> wrote:
>
> On Thu, Jun 19, 2025 at 4:42 PM Shlok Kyal <shlok(dot)kyal(dot)oss(at)gmail(dot)com> wrote:
> ...
> > > 3.
> > > TBH, I was wondering why a new catalog attribute was necessary...
> > >
> > > Can't you simply re-use the existing attribute "prattrs" attribute.
> > > e.g. let's just define negative means exclude.
> > >
> > > e.g. a value of 1 3 means only the 1st and 3rd columns are published
> > > e.g. a value of -1 -3 means all columns except 1st and 3rd columns are published
> > > e.g. a value of null mean all columns are published
> > >
> > > (mixes of negative and positive will not be possible)
> > >
> >
> > Currently I have added a new attribute 'prexcludeattrs' in
> > pg_publication_rel table. I used this approach because it will be
> > easier for user to get the exclude column list, in code no extra
> > processing is required to get the exclude column list.
> >
> > For an approach to use negative numbers for exclude columns. I see an
> > advantage that we do not need to introduce a new column for
> > pg_publication_rel. But in code, each time we want to get a column
> > list or exclude column list we need an extra processing of 'prattrs'
> > columns. Also I don't see any existing catalog table using a negative
> > attribute for column list.
> >
> > Based on above observations, I feel that the current is better.
> >
> > Please correct me if I missed an advantage for the approach you suggested.
> >
>
> OK. Maybe using negative numbers was a bridge too far...
>
> But IMO it is not good to have 2 separate attributes for the lists.
> Doing so implies they can coexist, but that is not true. I felt there
> are not really 2 "kinds" of columns list anyway -- there is just a
> "column list" which defines columns that are either included or
> excluded from the publication determined by EXCEPT.
>
> Having  dual lists gets weird/confusing to describe them -- you end up
> continually having to refer to the other one to clarify behaviour.
>
> e.g. Does 'prattrs' value NULL mean publish everything? Well, no...
> that depends if there is a non null 'prexcludeattrs'
> e.g. Does 'prexcludeattrs' value NULL mean publish everything? Well,
> no... that depends if there is a non null 'prattrs'
>
> Furthermore, all the code is doubling up referring to "column list"
> and "exclude column list"  -- code / docs / comments / error messages.
> There are quite a lot of places the patch touches that I thought were
> not really needed if you don't have 2 different kinds of column-lists.
>
> To summarise, I felt it would be better to just keep the existing
> 'prattrs' as the one-and-only column list, but add another BOOLEAN
> attribute to flag whether 'prattrs' columns should be included or
> excluded.
>
> prattrs;   prattrs_exclude;  Means
> --------------------------------------------
> 1 2 3     f                          only cols 1,2,3 will be published
> 4 5 6     t                          only cols 4,5,6 will NOT be published
> null       f                          all cols are published (flag is ignored)
> null       t                          all cols are published (flag is ignored)
>
I agree with your point and also it would be a better approach. In
patch 0001 an column 'prexcept' was added in pg_publication_rel. We
use that only for publication with all tables. I have reused this
column for patch 0003. If publication is not for all tables and the
'prexcept' flag is true, it implies that the columns in 'prattrs' are
to be excluded from being published. I have included the changes for
it in v14-0003 patch.
> > > 5.
> > > +  <para>
> > > +   Alter publication <structname>mypublication</structname> to add table
> > > +   <structname>users</structname> except column
> > > +   <structname>security_pin</structname>:
> > > +<programlisting>
> > > +ALTER PUBLICATION production_publication ADD TABLE users EXCEPT (security_pin);
> > >
> > > Those tags don't seem correct. e.g. "users" and "security_pin" are not
> > > <structname> (???).
> > >
> > > Perhaps, every other example here is wrong too and you just copied
> > > them? Anyway, something here looks wrong to me.
> > >
> > I saw different documents and usage of tags seems not well defined.
> > For example for table we are using tags in document
> > create_publication.sgml, update.sgml <structname> is used, in document
> > table.sgml, advanced.sgml <classname> is used, and in
> > logical-replication.sgml <literal>  is used. Similarly for column
> > names <structname>, <structfield> or <literal> are used in different
> > parts of the document.
> >
> > I kept the changed tag to <structfield> for the column for this patch.
> > Do you have any suggestions?
>
> No, for this patch I think it is best that you just follow nearby code
> (as you are already doing). I plan to raise another thread to ask what
> are the guidelines for this  sort of markup which is currently used
> inconsistently in different places.
Thanks for starting a thread for it.
>
> //////////
>
> Below are a few more review comments for v13-0003
>
> ======
> Commit message
>
> 1.
> Typo /THe/The/
>
> ~~~
Fixed
> 2.
> The new syntax allows specifying excluded column list when creating or
> altering a publication. For example:
> CREATE PUBLICATION pubname FOR TABLE tabname EXCEPT (exclude_column_list)
> or
> ALTER PUBLICATION pubname ADD TABLE tabname EXCEPT (exclude_column_list)
>
> ~
>
> I felt since you say these "For example:" it would be better to give
> real examples.
> e.g. say "(col1,col2,col3)" instead of "(exclude_column_list)".
>
Fixed
> ~~~
>
> 3.
> Typo /family of command/family of commands/
>
> ======
> doc/src/sgml/logical-replication.sgml
>
> 4.
> I am not sure that it was a good idea to be making a new term called
> an "exclude column list"... because in introduces a new concept of
> something that sounds like it is a different kind of list, and now you
> have to keep referring everywhere to both to "column list" versus
> "exclude column list". All the doubling up add more complication I
> think.
>
> IMO really there is just a "column list". Whether that list is for
> exclusion or not just depends on the presence of EXCEPT. So I felt
> maybe all places mentioning "exclude column list" could be rephrased.
>
> ======
> src/backend/catalog/pg_publication.c
>
> 5.
> +/*
> + * Returns true if the relation has exluded column list associated with the
> + * publication, false otherwise.
> + *
> + * If a exclude column list is found, the corresponding bitmap is returned
> + * through the cols parameter, if provided. The bitmap is constructed
> within the
> + * given memory context (mcxt).
> + */
> +
>
> Typo /exluded column/an excluded column/
> Typo /exclude column list/excluded column list/
>
updated the comment according to latest implementation
> ~~~
>
> 6.
> +/*
> + * pub_exclude_collist_validate
> + * Process and validate the 'excluded columns' list and ensure the columns
> + * are all valid to exclude from publication.  Checks for and raises an
> + * ERROR for any unknown columns, system columns, duplicate columns, or
> + * generated columns.
> + *
>
> Why can't you exclude generated columns?
>
> e.g. Maybe PUBLICATION says publish_generated_columns=stored and there
> are 100s of such columns, but the user just wants to exclude one of
> them. Why say they cannot do that? Hmm. Perhaps this is being already
> handled elsewhere, in which case this comment still seems misleading.
>
I have removed this restriction. Now we can specify stored generated
columns in EXCEPT (column_list) when we use the
'publish_generated_columns' flag.
> ======
> src/backend/commands/publicationcmds.c
>
> 7.
> + * With REPLICA IDENTITY FULL, no column list and no excluded column
> + * list is allowed.
>
> Really, just "no column list is allowed." same as it said before.
>
> ======
Fixed
Thanks and Regards,
Shlok Kyal
| Attachment | Content-Type | Size | 
|---|---|---|
| v14-0001-Add-RESET-clause-to-Alter-Publication-which-will.patch | application/octet-stream | 20.5 KB | 
| v14-0003-Skip-publishing-the-columns-specified-in-FOR-TAB.patch | application/octet-stream | 54.9 KB | 
| v14-0002-Skip-publishing-the-tables-specified-in-EXCEPT-T.patch | application/octet-stream | 68.8 KB | 
| From | Date | Subject | |
|---|---|---|---|
| Next Message | shveta malik | 2025-06-24 05:07:00 | Re: Logical Replication of sequences | 
| Previous Message | Michael Paquier | 2025-06-24 04:14:55 | Re: [PATCH] Add additional extended protocol commands to psql: \parse and \bindx |