Re: Non-text mode for pg_dumpall

From: Mahendra Singh Thalor <mahi6run(at)gmail(dot)com>
To: Vaibhav Dalvi <vaibhav(dot)dalvi(at)enterprisedb(dot)com>
Cc: pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: Non-text mode for pg_dumpall
Date: 2025-11-27 08:15:12
Message-ID: CAKYtNAoEUvYEG207zaGY0pEh6TB2sk6hpuz9LdG-fYEC=e2CgQ@mail.gmail.com
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Thanks Vaibhav for the review.

On Tue, 18 Nov 2025 at 16:05, Vaibhav Dalvi
<vaibhav(dot)dalvi(at)enterprisedb(dot)com> wrote:
>
> Hi Mahendra,
>
> Thanks Mahendra for working on this.
>
> Looks like my previous comment below is not addressed:
> 1.
>
>> ### Use of Dump Options Structure (dopt)
>> Please ensure consistency by utilizing the main dump options
>> structure (`dopt`) instead of declaring and using individual variables
>> where the structure already provides fields. For example, the
>> `output_clean` variable seems redundant here:
>> ```c
>> case 'c':
>> output_clean = true;
>> dopt.outputClean = 1;
>> break;
>> ```
>

Fixed. output_clean was a global variable because it was used in 2
functions. Now I am passing dopt. output_clean as function argument
for another function.

>
> I agree that the output_clean variable is not added by your patch
> but the introduction of dopt by your patch makes it redundant because
> dopt has dopt.outputClean. Please look at below code from pg_dump.c
> for the reference:
>
> case 'c': /* clean (i.e., drop) schema prior to create */
> dopt.outputClean = 1;
> break;
> case 25:
> dopt.restrict_key = pg_strdup(optarg);
> break;
>
> 2.
>
>> ### 3\. Missing Example in SGML Documentation
>> The SGML documentation for `pg_dumpall` is missing an explicit
>> example demonstrating its use with non-text formats (e.g., directory format).
>> It would be beneficial to include a clear example for this new feature.
>
>
> I think pg_dumpall should have separate examples similar to pg_dump
> rather than referencing the pg_dump example because pg_dumpall
> doesn't have to mention the database name without -l or --database
> in the command.
>

Fixed. Added some examples.

> 3.
>>
>> > 1. Is the following change in `src/bin/pg_dump/connectdb.c` intentional?
>>
>> >
>> > ```
>> > --- a/src/bin/pg_dump/connectdb.c
>> > +++ b/src/bin/pg_dump/connectdb.c
>> Yes, we need this. If there is any error, then we were trying to
>> disconnect the database in 2 places so we were getting a crash. I will
>> try to reproduce crashe without this patch and will respond.
>
> Have you added a test case in the regression suite which fails if we remove
> this particular change and works well with the change? or if possible could
> you please demonstrate here at least.

Fixed. With AH(archive), we should not free pointers by this exec call
as we free this by exit_nicely hook. (we register AH by
on_exit_close_archive).

>
> 4. The variable name append_data doesn't look meaningful to me.
> Instead we can use append_database/append_databases?
> because if this variable is set then we dump the databases along with
> global objects. In case of pg_dump, append_data or data_only does make
> sense to differentiate between schema and data but in case of pg_dumpall
> if this variable is set then we're dumping schema as well as data i.e. in-short
> the databases.
>

As of now, I am keeping this append_data as this was from an already
committed patch.

> ------------------------------------ pg_dumpall.c ----------------------------------------
>
> 5. The variable name formatName doesn't follow the naming convention of
> variables available around it. I think use of format_name/formatname would
> be better.
>
>> char *use_role = NULL;
>> const char *dumpencoding = NULL;
>> + const char *formatName = "p";
>> trivalue prompt_password = TRI_DEFAULT;
>> bool data_only = false;
>> bool globals_only = false;
>

Fixed.

>
> ------------------------------------ pg_restore.c ----------------------------------------
>
> 6. Fourth parameter (i.e. append_data) to function restore_global_objects() is redundant.
> All the time value provided by all callers to this parameter is false.
>
> I would suggest removing this parameter and in the definition of this function
> call function restore_one_database() with false as 4th argument. Find diff below:
>

Fixed.

> --- a/src/bin/pg_dump/pg_restore.c
> +++ b/src/bin/pg_dump/pg_restore.c
> @@ -64,8 +64,7 @@ static int restore_one_database(const char *inputFileSpec, RestoreOptions *opts,
> int numWorkers, bool append_data, int num,
> bool globals_only);
> static int restore_global_objects(const char *inputFileSpec,
> - RestoreOptions *opts, int numWorkers, bool append_data,
> - int num, bool globals_only);
> + RestoreOptions *opts, int numWorkers, int num, bool globals_only);
> static int restore_all_databases(const char *inputFileSpec,
> SimpleStringList db_exclude_patterns, RestoreOptions *opts, int numWorkers);
> static int get_dbnames_list_to_restore(PGconn *conn,
> @@ -554,7 +553,7 @@ main(int argc, char **argv)
>
> /* Set path for toc.glo file. */
> snprintf(global_path, MAXPGPATH, "%s/toc.glo", inputFileSpec);
> - n_errors = restore_global_objects(global_path, opts, numWorkers, false, 0, globals_only);
> + n_errors = restore_global_objects(global_path, opts, numWorkers, 0, globals_only);
>
> pg_log_info("database restoring skipped because option -g/--globals-only was specified");
> }
> @@ -602,7 +601,7 @@ main(int argc, char **argv)
> * If globals_only is set, then skip DROP DATABASE commands from restore.
> */
> static int restore_global_objects(const char *inputFileSpec, RestoreOptions *opts,
> - int numWorkers, bool append_data, int num, bool globals_only)
> + int numWorkers, int num, bool globals_only)
> {
> int nerror;
> int format = opts->format;
> @@ -610,8 +609,8 @@ static int restore_global_objects(const char *inputFileSpec, RestoreOptions *opt
> /* Set format as custom so that toc.glo file can be read. */
> opts->format = archCustom;
>
> - nerror = restore_one_database(inputFileSpec, opts, numWorkers,
> - append_data, num, globals_only);
> + nerror = restore_one_database(inputFileSpec, opts, numWorkers, false, num,
> + globals_only);
>
> /* Reset format value. */
> opts->format = format;
> @@ -1097,7 +1096,7 @@ restore_all_databases(const char *inputFileSpec,
>
> /* If map.dat has no entries, return after processing global commands. */
> if (dbname_oid_list.head == NULL)
> - return restore_global_objects(global_path, opts, numWorkers, false, 0, false);
> + return restore_global_objects(global_path, opts, numWorkers, 0, false);
>
> pg_log_info(ngettext("found %d database name in \"%s\"",
> "found %d database names in \"%s\"",
> @@ -1151,7 +1150,7 @@ restore_all_databases(const char *inputFileSpec,
> PQfinish(conn);
>
> /* Open toc.dat file and execute/append all the global sql commands. */
> - n_errors_total = restore_global_objects(global_path, opts, numWorkers, false, 0, false);
> + n_errors_total = restore_global_objects(global_path, opts, numWorkers, 0, false);
>
> Regression is successful with these changes.
>
> 7. Fix indentation:
>>
>> static int restore_global_objects(const char *inputFileSpec,
>> RestoreOptions *opts, int numWorkers, bool append_data,
>> int num, bool globals_only);
>> static int restore_all_databases(const char *inputFileSpec,
>> SimpleStringList db_exclude_patterns, RestoreOptions *opts, int numWorkers);

Fixed some.

>
>
> 8. Remove extra line:
>>
>> +
>> static void usage(const char *progname);
>

Fixed.

>
> 9. Remove extra space after map.dat and before comma:
>>
>> + * databases from map.dat , but skip restoring those matching
>

Fixed.

>
> 10. Fix 80 char limits:
>
> + n_errors = restore_one_database(subdirpath, opts, numWorkers, true, 1, false);
>
> + num_total_db = get_dbname_oid_list_from_mfile(inputFileSpec, &dbname_oid_list);
>
> + return restore_global_objects(global_path, opts, numWorkers, false, 0, false);
>
> + n_errors_total = restore_global_objects(global_path, opts, numWorkers, false, 0, false);
>
> + pg_log_warning("errors ignored on database \"%s\" restore: %d", dbidname->str, n_errors);
>

Fixed some.
I will do some more cleanup in the coming versions.

Here, I am attaching an updated patch for the review and testing.

>
> Regards,
> Vaibhav
>
> On Mon, Nov 17, 2025 at 10:45 PM Mahendra Singh Thalor <mahi6run(at)gmail(dot)com> wrote:
>>
>> Thanks Andrew for the review.
>> On Tue, 11 Nov 2025 at 20:41, Andrew Dunstan <andrew(at)dunslane(dot)net> wrote:
>> >
>> >
>> > On 2025-11-11 Tu 12:59 AM, Mahendra Singh Thalor wrote:
>> > >
>> > > Hi,
>> > > Here, I am attaching an updated patch for the review and testing.
>> > >
>> > > FIX: as suggested by Vaibhav, added error for --restrict-key option
>> > > with non-text format.
>> > >
>> >
>> >
>> > Regarding the name and format of the globals toc file, I'm inclined to
>> > think we should always use custom format, regardless of whether the
>> > individual databases will be in custom, tar or directory formats, and
>> > that it should be called something distinguishable, e.g. toc.glo.
>> >
>>
>> I also agree with your point. Fixed.
>>
>> On Mon, 17 Nov 2025 at 19:38, tushar <tushar(dot)ahuja(at)enterprisedb(dot)com> wrote:
>> >
>> >
>> >
>> > On Tue, Nov 11, 2025 at 11:29 AM Mahendra Singh Thalor <mahi6run(at)gmail(dot)com> wrote:
>> >>
>> >> On Thu, 6 Nov 2025 at 11:03, Mahendra Singh Thalor <mahi6run(at)gmail(dot)com> wrote:
>> >> >
>> >> > Thanks Vaibhav, Tushar and Andrew for the review and testing.
>> >>
>> >
>> > Thanks Mahendra, getting this error against v07 series patch
>> >
>> > [edb(at)1a1c15437e7c bin]$ ./pg_dumpall -Ft -f tar.dumpc -v
>> > pg_dumpall: executing SELECT pg_catalog.set_config('search_path', '', false);
>> > pg_dumpall: pg_dumpall.c:2256: createOneArchiveEntry: Assertion `fout != ((void *)0)' failed.
>> > Aborted
>> >
>> > regards,
>>
>> Thanks Tushar for the report. Fixed.
>>
>> Here, I am attaching an updated patch for the review and testing.
>>
>> --
>> Thanks and Regards
>> Mahendra Singh Thalor
>> EnterpriseDB: http://www.enterprisedb.com

--
Thanks and Regards
Mahendra Singh Thalor
EnterpriseDB: http://www.enterprisedb.com

Attachment Content-Type Size
v09_27112025-Non-text-modes-for-pg_dumpall-correspondingly-change.patch application/octet-stream 89.1 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Rahila Syed 2025-11-27 08:21:39 Re: show size of DSAs and dshash tables in pg_dsm_registry_allocations
Previous Message Nazir Bilal Yavuz 2025-11-27 08:00:03 Re: Add pg_buffercache_mark_dirty[_all] functions to the pg_buffercache