| From: | Akshay Joshi <akshay(dot)joshi(at)enterprisedb(dot)com> |
|---|---|
| To: | Andrew Dunstan <andrew(at)dunslane(dot)net> |
| Cc: | Euler Taveira <euler(at)eulerto(dot)com>, Álvaro Herrera <alvherre(at)kurilemu(dot)de>, Chao Li <li(dot)evan(dot)chao(at)gmail(dot)com>, japin <japinli(at)hotmail(dot)com>, Quan Zongliang <quanzongliang(at)yeah(dot)net>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org> |
| Subject: | Re: [PATCH] Add pg_get_database_ddl() function to reconstruct CREATE DATABASE statement |
| Date: | 2026-01-21 09:31:44 |
| Message-ID: | CANxoLDd73eDHykBquqqButBp73mHudyM-3iQnYp6FmPnetZb-g@mail.gmail.com |
| Views: | Whole Thread | Raw Message | Download mbox | Resend email |
| Thread: | |
| Lists: | pgsql-hackers |
In my previous email, I included two different patches (for two separate
approaches) from different branches. As a result, CommitFest is indicating
that a rebase is required.
Apologies for the inconvenience, I’m still getting familiar with the
process.
Attached are the patches, layered one on top of the other, representing two
approaches:
- *Double Dash*:
v8-0001-Add-pg_get_database_ddl-function-to-reconstruct-double-dash.patch
- *DefElem (Key-Value)*:
v8-0002-Add-pg_get_database_ddl-function-to-reconstruct-DefElem.patch
I am now submitting the *v8 patches*, which are ready for review. Please
let me know which approach you find more suitable and preferable.
On Tue, Jan 20, 2026 at 5:06 PM Akshay Joshi <akshay(dot)joshi(at)enterprisedb(dot)com>
wrote:
> All,
>
> Following Alvaro's suggestion to use DefElem syntax, I have modified the
> code to support key-value pairs.
>
> I have attached two different patches :
> v7-0001: Uses the double-dash syntax.
> v7-0002: Uses the DefElem syntax.
>
> *Usage Example for double-dash approach: *
> SELECT pg_get_database_ddl('postgres', '--no-owner', '--no-tablespace');
> -- Omit Owner and Tablespace clauses.
> SELECT pg_get_database_ddl('postgres', 'pretty', '--with-defaults'); --
> Includes clauses for parameters at their default values.
>
> *Usage Example for DefElem approach*: The DefElem implementation supports
> various boolean values (no, false, 0) and is case-insensitive.
> SELECT pg_get_database_ddl('postgres', 'owner=no', 'tablespace=false'); --
> Omits Owner and Tablespace clauses.
> SELECT pg_get_database_ddl('postgres', 'pretty', 'defaults=yes'); --
> Includes clauses for parameters at their default values.
> SELECT pg_get_database_ddl('postgres', 'pretty', 'defaults'); -- Includes
> clauses for parameters at their default values.
>
>
> *Please suggest which approach is preferred. The patches are ready for
> review.*
>
>
> On Fri, Jan 16, 2026 at 12:53 PM Akshay Joshi <
> akshay(dot)joshi(at)enterprisedb(dot)com> wrote:
>
>>
>> -
>>
>> 1) Implement *uint64* for the flags to provide capacity for up to 64
>> distinct options.
>> 2) Refactor parse_ddl_options to return the flag set directly rather
>> than using an *out parameter*.
>>
>> Please find the attached *v7* patch, which is now ready for review.
>>
>>
>> On Wed, Jan 14, 2026 at 9:18 PM Akshay Joshi <
>> akshay(dot)joshi(at)enterprisedb(dot)com> wrote:
>>
>>> I have incorporated Euler’s changes, which modify the declaration and
>>> definition of the *pg_get_database_ddl* function. Please find the
>>> attached v6 patch, which is now ready for review. The following updates
>>> have been made:
>>>
>>> 1.
>>>
>>> Function signature updated to: pg_get_database_ddl(database_id
>>> regdatabase, VARIADIC ddl_options text[])
>>> 2.
>>>
>>> Added options *--no-owner* and *--no-tablespace* to omit the OWNER
>>> and TABLESPACE clauses from the reconstructed DDL.
>>> 3.
>>>
>>> Moved the "*pretty*" parameter into ddl_options for formatted output.
>>> 4.
>>>
>>> Properties such as ENCODING, LOCALE_PROVIDER, TABLESPACE, CONNECTION
>>> LIMIT, and ALLOW_CONNECTIONS are omitted if they are set to default values.
>>> 5.
>>>
>>> Introduced the *--with-defaults* option to include clauses for
>>> parameters even when they are at their default values.
>>> 6.
>>>
>>> Standardized formatting to use spaces instead of tabs.
>>>
>>> *Usage examples:*
>>>
>>> 1. SELECT pg_get_database_ddl('postgres'); // Non pretty-formatted
>>> DDL
>>> 2. SELECT pg_get_database_ddl(16835); // Non pretty-formatted DDL
>>> 3. SELECT pg_get_database_ddl('postgres', 'pretty'); //
>>> pretty-formatted DDL
>>> 4. SELECT pg_get_database_ddl('postgres', '--no-owner',
>>> '--no-tablespace'); // Omits the Owner and Tablespace clause from the DDL.
>>> 5. SELECT pg_get_database_ddl('postgres', 'pretty',
>>> '--with-defaults'); // Includes clauses for parameters that are currently
>>> at their default values.
>>>
>>>
>>> On Thu, Jan 8, 2026 at 4:57 AM Andrew Dunstan <andrew(at)dunslane(dot)net>
>>> wrote:
>>>
>>>>
>>>> On 2025-12-12 Fr 10:19 AM, Euler Taveira wrote:
>>>>
>>>> On Fri, Dec 12, 2025, at 7:52 AM, Akshay Joshi wrote:
>>>>
>>>> On Thu, Dec 11, 2025 at 7:29 PM Euler Taveira <euler(at)eulerto(dot)com> <euler(at)eulerto(dot)com> wrote:
>>>>
>>>> Is there any way to obtain the default values directly from the source
>>>> code itself, or do I need to refer to the documentation? If we rely on
>>>> the documentation and compare against that, then in the future, if the
>>>> default values change, we would also need to update our logic
>>>> accordingly.
>>>>
>>>>
>>>> No, you need to check the documentation. If you are changing the default value,
>>>> you are breaking compatibility; that rarely happens. If we are really concern
>>>> about this fact, you can add a test case that creates the object without
>>>> properties (all default values) and another with all default properties and
>>>> then compare the output.
>>>>
>>>>
>>>> Maybe the function should have a VERBOSE option that emits all the
>>>> defaults.
>>>>
>>>>
>>>> Constantly having to check the documentation for default values may
>>>> feel annoying to some users. Some users run queries with parameters
>>>> such as encoding, connection limit, and locale using their default
>>>> values. When they call the pg_get_database_ddl function, it
>>>> reconstructs the short command based on those defaults.
>>>>
>>>>
>>>> Encoding and locale, ok but I doubt about connection limit.
>>>>
>>>> postgres=# SELECT current_user;
>>>> current_user
>>>> --------------
>>>> euler
>>>> (1 row)
>>>>
>>>> postgres=# CREATE DATABASE foo;
>>>> CREATE DATABASE
>>>> postgres=# CREATE DATABASE bar OWNER euler;
>>>> CREATE DATABASE
>>>>
>>>> When you are learning a new command, you generally don't set the default value
>>>> for a property just to be correct. I'm not saying this function shouldn't
>>>> include OWNER. I'm just suggesting it to be optional. See some arguments
>>>> below.
>>>>
>>>>
>>>> * OWNER. There is no guarantee that the owner exists in the cluster you will
>>>> use this output. That's something that pg_dumpall treats separately (see
>>>> above). Does it mean we should include the owner? No. We can make it an
>>>> option.
>>>>
>>>>
>>>> If I understand correctly, the owner should be an option provided by
>>>> the caller of the function, and we reconstruct the Database DDL using
>>>> that specified owner. Is that right?
>>>> If so, then in my humble opinion, this is not truly a reconstruction
>>>> of the existing database object.
>>>>
>>>>
>>>> No. My idea is to have something like the pg_dump --no-owner option. This is
>>>> important if you are transporting the objects from one cluster to another one.
>>>> Owner might be different. That's why I'm suggesting it should be optional. It
>>>> means flexibility. See pg_dump output format that always apply the OWNER as a
>>>> separate ALTER command.
>>>>
>>>>
>>>> +1
>>>>
>>>>
>>>> * options. Since I mentioned options for some properties (owner, strategy,
>>>> template), these properties can be accommodated as a VARIADIC argument. The
>>>> function signature can be something like
>>>>
>>>> pg_get_database_ddl(oid, VARIADIC options text[])
>>>>
>>>> I would include the pretty print into options too.
>>>>
>>>>
>>>> Same comment as the one I gave for the Owner, if you are referring to
>>>> these as options to the function.
>>>>
>>>>
>>>> Let me elaborate a bit. As I suggested you can control the output with options.
>>>> Why? Flexibility.
>>>>
>>>> Why am I suggesting such a general purpose implementation? See some of the use
>>>> cases.
>>>>
>>>> 1. object DDL. Check DDL to recreate the object. It is not the exact DDL that
>>>> the user informed but it produces the same result.
>>>> 2. clone tool. Clone the objects to recreate the environment for another
>>>> customer. These objects can be created in the same cluster or in another one.
>>>> (Of course, global objects don't apply for the same cluster.)
>>>> 3. dump tool. Dump the commands to recreate the existing objects.
>>>> 4. diff tool. There are tools like pgquarrel [1] that queries the catalog and
>>>> compare the results to create commands to turn the target database into the
>>>> source database. The general purpose functions can be used if the object
>>>> doesn't exist in the target database. (Of course, it doesn't apply for global
>>>> objects but again it is a good UI to have all of these pg_get_OBJECT_ddl
>>>> functions using the same approach.)
>>>> 5. logical replication. These pg_get_OBJECT_ddl functions can be good
>>>> candidates to be used in the initial schema replication and even in the DDL
>>>> replication (if the object doesn't exist in the target database).
>>>>
>>>> The "options" parameter is to get the DDL command to serve any of these use
>>>> cases. There are some properties in a certain object that you *don't* want for
>>>> whatever reason. See some --no-OBJECT options in pg_dump. Let's say you don't
>>>> want the TABLESPACE or the table access method while getting the CREATE TABLE
>>>> DDL because it is different in the other database.
>>>>
>>>>
>>>> +1
>>>>
>>>>
>>>> I received a review comment suggesting the use of tabs. I also looked
>>>> up PostgreSQL best practices on google, which recommend using tabs for
>>>> indentation and spaces for alignment. I’m open to updating my code
>>>> accordingly.
>>>>
>>>>
>>>> I didn't check all of the possible output but the majority uses space instead
>>>> of tabs. Check psql. If you check the git history (git log --grep=tabs), you
>>>> will notice that tabs are removed from source code.
>>>>
>>>>
>>>>
>>>> We should follow the pretty printing style in ruleutils.c, which uses
>>>> spaces.
>>>>
>>>>
>>>> * permission. I don't think you need to check for permissions inside the
>>>> function. I wouldn't want a different behavior than pg_dump(all). You can
>>>> always adjust it in system_functions.sql.
>>>>
>>>>
>>>> We’ve already had extensive discussions on this topic in the same
>>>> email thread, and ultimately we decided to add the permission check.
>>>>
>>>>
>>>> That's fair. Again, I expect that all of these pg_get_OBJECT_ddl functions use
>>>> the same approach. We can always relax this restriction in the future.
>>>>
>>>>
>>>>
>>>> +1
>>>>
>>>>
>>>> cheers
>>>>
>>>>
>>>> andrew
>>>>
>>>> --
>>>> Andrew Dunstan
>>>> EDB: https://www.enterprisedb.com
>>>>
>>>>
| Attachment | Content-Type | Size |
|---|---|---|
| v8-0002-Add-pg_get_database_ddl-function-to-reconstruct-DefElem.patch | application/octet-stream | 14.5 KB |
| v8-0001-Add-pg_get_database_ddl-function-to-reconstruct-double-dash.patch | application/octet-stream | 26.4 KB |
| From | Date | Subject | |
|---|---|---|---|
| Next Message | VASUKI M | 2026-01-21 09:37:01 | Re: Optional skipping of unchanged relations during ANALYZE? |
| Previous Message | Nitin Motiani | 2026-01-21 08:53:46 | Re: Adding pg_dump flag for parallel export to pipes |