| From: | Andrew Dunstan <andrew(at)dunslane(dot)net> |
|---|---|
| To: | Euler Taveira <euler(at)eulerto(dot)com>, Akshay Joshi <akshay(dot)joshi(at)enterprisedb(dot)com> |
| Cc: | Álvaro Herrera <alvherre(at)kurilemu(dot)de>, Chao Li <li(dot)evan(dot)chao(at)gmail(dot)com>, japin <japinli(at)hotmail(dot)com>, Quan Zongliang <quanzongliang(at)yeah(dot)net>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org> |
| Subject: | Re: [PATCH] Add pg_get_database_ddl() function to reconstruct CREATE DATABASE statement |
| Date: | 2026-01-07 23:27:54 |
| Message-ID: | c2437767-c131-43a4-b779-37c391cf4fe3@dunslane.net |
| Views: | Whole Thread | Raw Message | Download mbox | Resend email |
| Thread: | |
| Lists: | pgsql-hackers |
On 2025-12-12 Fr 10:19 AM, Euler Taveira wrote:
> On Fri, Dec 12, 2025, at 7:52 AM, Akshay Joshi wrote:
>> On Thu, Dec 11, 2025 at 7:29 PM Euler Taveira<euler(at)eulerto(dot)com> wrote:
>> Is there any way to obtain the default values directly from the source
>> code itself, or do I need to refer to the documentation? If we rely on
>> the documentation and compare against that, then in the future, if the
>> default values change, we would also need to update our logic
>> accordingly.
>>
> No, you need to check the documentation. If you are changing the default value,
> you are breaking compatibility; that rarely happens. If we are really concern
> about this fact, you can add a test case that creates the object without
> properties (all default values) and another with all default properties and
> then compare the output.
Maybe the function should have a VERBOSE option that emits all the defaults.
>
>> Constantly having to check the documentation for default values may
>> feel annoying to some users. Some users run queries with parameters
>> such as encoding, connection limit, and locale using their default
>> values. When they call the pg_get_database_ddl function, it
>> reconstructs the short command based on those defaults.
>>
> Encoding and locale, ok but I doubt about connection limit.
>
> postgres=# SELECT current_user;
> current_user
> --------------
> euler
> (1 row)
>
> postgres=# CREATE DATABASE foo;
> CREATE DATABASE
> postgres=# CREATE DATABASE bar OWNER euler;
> CREATE DATABASE
>
> When you are learning a new command, you generally don't set the default value
> for a property just to be correct. I'm not saying this function shouldn't
> include OWNER. I'm just suggesting it to be optional. See some arguments
> below.
>
>>> * OWNER. There is no guarantee that the owner exists in the cluster you will
>>> use this output. That's something that pg_dumpall treats separately (see
>>> above). Does it mean we should include the owner? No. We can make it an
>>> option.
>>>
>> If I understand correctly, the owner should be an option provided by
>> the caller of the function, and we reconstruct the Database DDL using
>> that specified owner. Is that right?
>> If so, then in my humble opinion, this is not truly a reconstruction
>> of the existing database object.
>>
> No. My idea is to have something like the pg_dump --no-owner option. This is
> important if you are transporting the objects from one cluster to another one.
> Owner might be different. That's why I'm suggesting it should be optional. It
> means flexibility. See pg_dump output format that always apply the OWNER as a
> separate ALTER command.
+1
>
>>> * options. Since I mentioned options for some properties (owner, strategy,
>>> template), these properties can be accommodated as a VARIADIC argument. The
>>> function signature can be something like
>>>
>>> pg_get_database_ddl(oid, VARIADIC options text[])
>>>
>>> I would include the pretty print into options too.
>>>
>> Same comment as the one I gave for the Owner, if you are referring to
>> these as options to the function.
>>
> Let me elaborate a bit. As I suggested you can control the output with options.
> Why? Flexibility.
>
> Why am I suggesting such a general purpose implementation? See some of the use
> cases.
>
> 1. object DDL. Check DDL to recreate the object. It is not the exact DDL that
> the user informed but it produces the same result.
> 2. clone tool. Clone the objects to recreate the environment for another
> customer. These objects can be created in the same cluster or in another one.
> (Of course, global objects don't apply for the same cluster.)
> 3. dump tool. Dump the commands to recreate the existing objects.
> 4. diff tool. There are tools like pgquarrel [1] that queries the catalog and
> compare the results to create commands to turn the target database into the
> source database. The general purpose functions can be used if the object
> doesn't exist in the target database. (Of course, it doesn't apply for global
> objects but again it is a good UI to have all of these pg_get_OBJECT_ddl
> functions using the same approach.)
> 5. logical replication. These pg_get_OBJECT_ddl functions can be good
> candidates to be used in the initial schema replication and even in the DDL
> replication (if the object doesn't exist in the target database).
>
> The "options" parameter is to get the DDL command to serve any of these use
> cases. There are some properties in a certain object that you *don't* want for
> whatever reason. See some --no-OBJECT options in pg_dump. Let's say you don't
> want the TABLESPACE or the table access method while getting the CREATE TABLE
> DDL because it is different in the other database.
+1
>
>> I received a review comment suggesting the use of tabs. I also looked
>> up PostgreSQL best practices on google, which recommend using tabs for
>> indentation and spaces for alignment. I’m open to updating my code
>> accordingly.
>>
> I didn't check all of the possible output but the majority uses space instead
> of tabs. Check psql. If you check the git history (git log --grep=tabs), you
> will notice that tabs are removed from source code.
We should follow the pretty printing style in ruleutils.c, which uses
spaces.
>
>>> * permission. I don't think you need to check for permissions inside the
>>> function. I wouldn't want a different behavior than pg_dump(all). You can
>>> always adjust it in system_functions.sql.
>>>
>> We’ve already had extensive discussions on this topic in the same
>> email thread, and ultimately we decided to add the permission check.
>>
> That's fair. Again, I expect that all of these pg_get_OBJECT_ddl functions use
> the same approach. We can always relax this restriction in the future.
+1
cheers
andrew
--
Andrew Dunstan
EDB:https://www.enterprisedb.com
| From | Date | Subject | |
|---|---|---|---|
| Next Message | Masahiko Sawada | 2026-01-07 23:46:46 | Re: POC: enable logical decoding when wal_level = 'replica' without a server restart |
| Previous Message | Nathan Bossart | 2026-01-07 23:27:31 | Re: improve performance of pg_dump with many sequences |