Re: Add pg_get_publication_ddl function

From: "Jonathan Gonzalez V(dot)" <jonathan(dot)abdiel(at)gmail(dot)com>
To: Peter Smith <smithpb2250(at)gmail(dot)com>
Cc: pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: Add pg_get_publication_ddl function
Date: 2026-06-09 11:15:38
Message-ID: 8733yw55id.fsf@gmail.com
Lists: pgsql-hackers

Peter Smith <smithpb2250(at)gmail(dot)com> writes:

Hi!

> ======
> doc/src/sgml/func/func-info.sgml
>
> (9.28.13. Get Object DDL Functions)
>
> 1.
> + <para role="func_signature">
> + <function>pg_get_publication_ddl</function>
> + ( <parameter>publication</parameter> <type>text</type>
> + <optional>, <literal>VARIADIC</literal> <parameter>options</parameter>
> + <type>text</type> </optional> )
> + <returnvalue>setof text</returnvalue>
> + </para>
>
> I think "pubname" might be a more meaningful name for the first parameter.

That would make sense, but the already-pushed DDL functions use the
plain object name, such as `database` or `tablespace`, so it makes
sense to me to follow the same pattern. Perhaps we can ask more people
what they think about this?

> ~~~
>
> 2.
> + <para>
> + Reconstructs the <command>CREATE PUBLICATION</command> statement for
> + the specified publication (by OID or name), followed by an
> + <command>ALTER PUBLICATION ... OWNER TO</command> statement (the
> + <command>CREATE PUBLICATION</command> grammar has no
> + <literal>OWNER</literal> clause). Each statement is returned as a
> + separate row. An error is raised if no publication with the supplied
> + OID or name exists. When the publication was created with
> + <literal>FOR ALL TABLES, ALL SEQUENCES</literal>, the emitted
> + statement always lists <literal>ALL TABLES</literal> before
> + <literal>ALL SEQUENCES</literal> regardless of the original order.
> + The following options are supported:
> + <literal>pretty</literal> (boolean) for formatted output and
> + <literal>owner</literal> (boolean) to include
> + <literal>OWNER</literal>.
> + </para></entry>
>
> 2a.
> That "CREATE PUBLICATION" should <link> back to the CREATE PUBLICATION
> docs page.

This is indeed a good idea. I would love to see this in the other
patches too; a separate patch updating all the functions would probably
be good. Applied for the next version.

> ~
>
> 2b.
> It is overkill to mention about the potential reordering of ALL TABLES
> and ALL SEQUENCES.
>
> Apart from being unnecessary, there are many other things that can also
> be rearranged which are not mentioned:
> - TABLES and ALL TABLES IN SCHEMA clauses might be in a different order
> than specified
> - The publication parameters might be in a different order than specified
> - The values of 'publish' parameter might be in a different order than specified
> - etc.

Agreed, removed.

> ~~~
>
> GENERAL
>
> 3.
> It would be better if the rows of "Table 9.96" were in alphabetical order.

I think that this should be done in a different patch when all or a big
part of the functions are merged.

> ======
> src/backend/utils/adt/ddlutils.c
>
> pg_get_publication_ddl_internal:
>
> 4.
> + if (pub->allsequences)
> + appendStringInfo(buf,
> + "%sALL SEQUENCES",
> + pub->alltables ? ", " : "");
>
> Maybe better to avoid tricky format strings.
>
> SUGGESTION
> if (pub->allsequences)
> {
> if (pub->alltables)
> appendStringInfo(buf, ", ");
>
> appendStringInfo(buf, "ALL SEQUENCES");
> }

I don't have a strong opinion on whether this is "tricky", but the
pattern is already used in many places, especially in the pg_get_*_ddl
functions. It makes clear where the comma goes when the condition is
true; with your suggestion it's a bit harder to read, I think. I would
like to have more opinions on this.
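
To make the comparison concrete, here is a minimal standalone sketch of
the two styles side by side. It does not use the real StringInfo API:
DemoBuf, demo_init and demo_append are hypothetical stand-ins so that
the snippet compiles outside the PostgreSQL tree, and the handling of
"ALL TABLES" is assumed rather than copied from the patch. Both
functions are meant to produce the same text.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for PostgreSQL's StringInfo: a fixed-size buffer. */
typedef struct
{
	char		data[128];
	size_t		len;
} DemoBuf;

static void
demo_init(DemoBuf *buf)
{
	buf->data[0] = '\0';
	buf->len = 0;
}

/* Append a plain string; asserts that the result still fits. */
static void
demo_append(DemoBuf *buf, const char *s)
{
	size_t		n = strlen(s);

	assert(buf->len + n < sizeof(buf->data));
	memcpy(buf->data + buf->len, s, n + 1);
	buf->len += n;
}

/* Style used in the patch: the separator is chosen inside the format. */
static void
for_all_format_style(DemoBuf *buf, bool alltables, bool allsequences)
{
	char		tmp[64];

	if (alltables)
		demo_append(buf, "ALL TABLES");	/* assumed, not from the patch */
	if (allsequences)
	{
		snprintf(tmp, sizeof(tmp), "%sALL SEQUENCES", alltables ? ", " : "");
		demo_append(buf, tmp);
	}
}

/* Style suggested in the review: the separator gets its own append. */
static void
for_all_explicit_style(DemoBuf *buf, bool alltables, bool allsequences)
{
	if (alltables)
		demo_append(buf, "ALL TABLES");	/* assumed, not from the patch */
	if (allsequences)
	{
		if (alltables)
			demo_append(buf, ", ");
		demo_append(buf, "ALL SEQUENCES");
	}
}
```

With both flags set, either function leaves "ALL TABLES, ALL SEQUENCES"
in the buffer, so the choice is only about readability.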

> ~~~
>
> 5.
> + if (pub_incl_relids != NIL)
> + {
> + ListCell *pub_cell;
> + char *schemaname = NULL;
> + char *tablename;
> +
> + append_ddl_option(buf, pretty, 4, "FOR TABLE ");
> +
> + /*
> + * Publication can have table relations
> + */
> + foreach(pub_cell, pub_incl_relids)
>
> Maybe that comment belongs earlier (above the if).

Yes! Indeed, it was a leftover from the previous refactor, thank you!

> ~~~
>
> 6.
> + appendStringInfo(buf, "%s%s",
> + foreach_current_index(pub_cell) > 0 ? ", " : "",
> + quote_qualified_identifier(schemaname, tablename));
>
> Another place where avoiding a tricky format string may be tidier.
>
> SUGGESTION
> if (foreach_current_index(pub_cell) > 0)
> appendStringInfo(buf, ", ");
>
> appendStringInfo(buf, "%s", quote_qualified_identifier(schemaname, tablename));

Same as above

> ~~~
>
> 7.
> + pubtuple = SearchSysCache2(PUBLICATIONRELMAP, ObjectIdGetDatum(relid),
> + ObjectIdGetDatum(pub->oid));
> +
> + if (!HeapTupleIsValid(pubtuple))
> + elog(ERROR,
> + "cache lookup failed for publication relation %u in publication %u",
> + relid, pub->oid);
>
> 7a.
> Maybe blank line here is not wanted.

It gives some space, it was intentional.

> ~
>
> 7b.
> Don't need to say "publication" 2x.
>
> /publication relation/relation/

Indeed.

> ~~~
>
> 10.
> + /*
> + * If there is a condition it goes after the columns. We can have
> + * conditions without columns as well.
> + */
> + if (!condition_nulls)
>
> 10a.
> The earlier assignment to 'conditions' should be moved to be directly
> above here.

In this case I think it's better to keep both calls to SysCacheGetAttr()
together, since both depend on the previous call to SearchSysCache2().

> ~
>
> 10b.
> BTW, it is called a "row filter" so maybe it is better to refer to
> that in the comments/vars instead of the generic sounding "condition".

Changed!

> ~~~
>
> 11.
> + /* If we have schemas, they will go right before the WITH */
>
> The kind of comments that just say "this-goes-before-that" or
> "this-goes-after-that" are not very useful, because it is obvious from
> the code logic that some appendStringInfo comes before or after
> another one.

I don't agree on this one. Since we're building a DDL statement, these
comments helped me a lot while reading the code to keep in mind what
goes first and what goes after; the comment was born as a helper for
the order of the strings.

> ~~~
>
> 12.
> + /*
> + * Schemas can be preceded by a list of tables. When they are, the
> + * "TABLES IN SCHEMA" stays inline as a continuation of the existing
> + * FOR clause; otherwise it starts the FOR clause on its own line in
> + * pretty mode.
> + */
>
> IMO it would be better for the FOR TABLE IN SCHEMA to come *before*
> the specific tables in FOR TABLE.
>
> e.g. For the case when there are specified tables "absorbed" into the
> same named schemas I think it is more natural to see the schemas
> first.
> CREATE PUBLICATION mypub FOR TABLES IN SCHEMA s, TABLE s.t1;

This decision was based on the documentation[1]; the order follows the
order of appearance there. I don't have a strong opinion on the order,
but I noticed something. The documentation says:
where publication_object is one of:
So in theory it should be either FOR TABLE or FOR TABLES IN SCHEMA,
which confuses me, because having both is actually possible.

> ~~~
>
> 14.
> + if (pub_excl_relids != NIL)
> + {
> + ListCell *excl_cell;
> + char *schemaname = NULL;
> +
> + appendStringInfoString(buf, " EXCEPT (TABLE ");
>
> The EXCEPT clause is currently permitted only with FOR ALL TABLES, so
> it would be better moving this to earlier in this function where
> pub->alltables was handled.

That would probably make sense, but it would also require refactoring
the entire code, and there are already discussions[2][3] about extending
EXCEPT. Tying the whole EXCEPT clause to FOR ALL TABLES only looks to me
like it would just create more work in the future.

> 16.
> + /*
> + * We need to know if we're the second permission added to prefix with a
> + * ", " string
> + */
> + if (pub->pubactions.pubinsert)
> + {
> + /*
> + * By precedence we know that the insert will always be first, no need
> + * to check previous values
> + */
> + appendStringInfoString(buf, "insert");
>
> Both these comments are doing little more than just saying the same as
> the code. IMO they are not needed.

Agreed, removed the first one.

> ~~~
>
> 17.
> + if (pub->pubactions.pubinsert)
> + {
> + /*
> + * By precedence we know that the insert will always be first, no need
> + * to check previous values
> + */
> + appendStringInfoString(buf, "insert");
> + first_perm = false;
> + }
> +
> + if (pub->pubactions.pubupdate)
> + {
> + appendStringInfo(buf, "%supdate", first_perm ? "" : ", ");
> + first_perm = false;
> + }
> + if (pub->pubactions.pubdelete)
> + {
> + appendStringInfo(buf, "%sdelete", first_perm ? "" : ", ");
> + first_perm = false;
> + }
> +
> + if (pub->pubactions.pubtruncate)
> + {
> + appendStringInfo(buf, "%struncate", first_perm ? "" : ", ");
> + }
> +
>
> 17a.
> There are some random blank lines that seem unnecessary.

True!

> ~
>
> 17b.
> IMO it is tidier to simply append the string you want, instead of
> using a trick format string.
>
> SUGGESTION (compare with patch)
>
> if (pub->pubactions.pubinsert)
> {
> appendStringInfoString(buf, "insert");
> first_perm = false;
> }
> if (pub->pubactions.pubupdate)
> {
> appendStringInfo(buf, first_perm ? "update" : ",update");
> first_perm = false;
> }
> if (pub->pubactions.pubdelete)
> {
> appendStringInfo(buf, first_perm ? "delete" : ",delete");
> first_perm = false;
> }
> if (pub->pubactions.pubtruncate)
> {
> appendStringInfo(buf, first_perm ? "truncate" : ",truncate");
> }

I really prefer not to repeat a word twice in the same line just to add
a `,` at the beginning.
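
For reference, a minimal standalone sketch of the first_perm pattern
follows, with the separator written once in the format string.
DemoPubActions, demo_add_action and demo_publish_list are hypothetical
names that only mirror the pubactions flags quoted above; the real
code uses StringInfo rather than a caller-supplied char buffer, and
wrapping the append in a helper is a simplification for this sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical mirror of the pubactions flags used in the patch. */
typedef struct
{
	bool		pubinsert;
	bool		pubupdate;
	bool		pubdelete;
	bool		pubtruncate;
} DemoPubActions;

/*
 * Append one action name, prefixed with ", " unless it is the first one.
 * Returns the new number of bytes used in "out".
 */
static size_t
demo_add_action(char *out, size_t outsize, size_t used,
				bool *first_perm, const char *action)
{
	int			n = snprintf(out + used, outsize - used, "%s%s",
							 *first_perm ? "" : ", ", action);

	/* snprintf reports the untruncated length; make sure it fit. */
	assert(n >= 0 && (size_t) n < outsize - used);
	*first_perm = false;
	return used + (size_t) n;
}

/* Build the value of the 'publish' parameter in a fixed action order. */
static void
demo_publish_list(char *out, size_t outsize, const DemoPubActions *actions)
{
	bool		first_perm = true;
	size_t		used = 0;

	assert(outsize > 0);
	out[0] = '\0';

	if (actions->pubinsert)
		used = demo_add_action(out, outsize, used, &first_perm, "insert");
	if (actions->pubupdate)
		used = demo_add_action(out, outsize, used, &first_perm, "update");
	if (actions->pubdelete)
		used = demo_add_action(out, outsize, used, &first_perm, "delete");
	if (actions->pubtruncate)
		used = demo_add_action(out, outsize, used, &first_perm, "truncate");
}
```

With only pubupdate and pubtruncate set, this fills the buffer with
"update, truncate"; each action name appears exactly once in the source.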

> ~~~
>
> 20.
> The generated boolean values (e.g. 'true'/'false') do not need to be quoted.

Changed

> ======
> src/test/regress/sql/publication_ddl.sql
>
> (Here are lots of test review comments; the first group are
> general so might apply to multiple test cases).
>
> 21.
> I think you can create all the necessary schema and tables together
> up-front instead of scattering them through the file.

Probably, but I prefer to create and use rather than create up-front
and expect to use later; otherwise, if a test is removed in the future,
the tables and schemas created for it would also have to be tracked
down and removed.

> ~~~
>
> 22.
> Make use of the proper publication terminology like "Column Lists" and
> "Row Filters" instead of vague "columns" and "conditions".

Agreed on using "row filters"; changed.

> ~~~
>
> 23.
> Here is an idea:
>
> Instead of having dozens of test publications, just have 1 test
> publication, which you CREATE/DROP for each test case.
>
> Then, since there is a fixed name publication (e.g. "mypub") for
> everything, you can make a subroutine to encapsulate the common code:
>
> +SELECT pg_get_publication_ddl('mypub');
> +SELECT pg_get_publication_ddl((SELECT oid FROM pg_publication WHERE
> pubname='mypub'));
> +SELECT pg_get_publication_ddl('mypub', 'pretty', 'true');
>
> It means your test .sql file can become much shorter/simpler I think.

I tried that at the beginning, but with so many similar tests, finding
the failing one was really hard when they all used "mypub". Using a name
plus a number makes it easy to find which one failed, rather than
searching for the same name multiple times.

> ~~~
>
> 24.
> There is duplication of some tests:
>
> e.g.
> +-- columns in publication must be quoted
> and
> +-- identifiers that require quoting: publication, schema, table and column

Indeed, the first one should be removed since the second one covers
more cases.

> ~~~
>
> 25.
> It is not needed to quote the booleans 'true'/'false' for the options.
>
> //////

Can you provide an example? I didn't understand this one.

> 26.
> +-- create base table to test basic table publication
>
> What does "basic table publication" mean? I expect it means different
> things to different people. Better to be explicit about what this is
> really testing.

True, changed.

> ~~~
>
> 27.
> +-- create publication for one table with two columns and a condition
> with an expression
>
> What does "with an expression" mean? All Row-Filters are expressions
> aren't they?

Changed.

> ~~~
>
> 28.
> +-- create a publication for a list of tables
>
> Not really describing what this test is doing, which is mixing FOR
> TABLE and FOR TABLES IN SCHEMA.

Changed

> ~~~
>
> 30.
> +-- create publication for all tables except two tables
>
> Actually this is also combining with an ALL SEQUENCES test.

Changed

> ~~~
>
> 32.
> +-- cleanup tables in schemas
>
> Not sure why this is done separately. Probably easier just to drop the
> schemas with CASCADE so their tables will be auto-deleted.

They were separate so as not to create tables that aren't used, but
sure, a CASCADE can be useful now since I don't think more tests will
be added.

[1] https://www.postgresql.org/docs/devel/sql-createpublication.html
[2] https://www.postgresql.org/message-id/flat/CANhcyEVSXyQkvmrsOWPdQqnm2J3GMyQQrKhyCJiBQzqs6AvSow%40mail.gmail.com
[3] https://www.postgresql.org/message-id/flat/CABdArM5sw4Q1ZU8HGdo4BSc1A_%2B8xtUNq17j6wcir%3DyMUy19Cg%40mail.gmail.com

--
Jonathan Gonzalez V.
EDB
https://www.enterprisedb.com
