Re: Non-text mode for pg_dumpall

From: Mahendra Singh Thalor <mahi6run(at)gmail(dot)com>
To: Vaibhav Dalvi <vaibhav(dot)dalvi(at)enterprisedb(dot)com>
Cc: pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: Non-text mode for pg_dumpall
Date: 2025-11-27 09:19:14
Message-ID: CAKYtNAppx6y7M+9Fb5+TtTNhRS6onZo9kHeQS6yQF7PiK6qnEA@mail.gmail.com
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, 27 Nov 2025 at 13:45, Mahendra Singh Thalor <mahi6run(at)gmail(dot)com> wrote:
>
> Thanks Vaibhav for the review.
>
> On Tue, 18 Nov 2025 at 16:05, Vaibhav Dalvi
> <vaibhav(dot)dalvi(at)enterprisedb(dot)com> wrote:
> >
> > Hi Mahendra,
> >
> > Thanks Mahendra for working on this.
> >
> > Looks like my previous comment below is not addressed:
> > 1.
> >
> >> ### Use of Dump Options Structure (dopt)
> >> Please ensure consistency by utilizing the main dump options
> >> structure (`dopt`) instead of declaring and using individual variables
> >> where the structure already provides fields. For example, the
> >> `output_clean` variable seems redundant here:
> >> ```c
> >> case 'c':
> >> output_clean = true;
> >> dopt.outputClean = 1;
> >> break;
> >> ```
> >
>
> Fixed. output_clean was a global variable because it was used in 2
> functions. Now I am passing dopt. output_clean as function argument
> for another function.
>
> >
> > I agree that the output_clean variable is not added by your patch
> > but the introduction of dopt by your patch makes it redundant because
> > dopt has dopt.outputClean. Please look at below code from pg_dump.c
> > for the reference:
> >
> > case 'c': /* clean (i.e., drop) schema prior to create */
> > dopt.outputClean = 1;
> > break;
> > case 25:
> > dopt.restrict_key = pg_strdup(optarg);
> > break;
> >
> > 2.
> >
> >> ### 3\. Missing Example in SGML Documentation
> >> The SGML documentation for `pg_dumpall` is missing an explicit
> >> example demonstrating its use with non-text formats (e.g., directory format).
> >> It would be beneficial to include a clear example for this new feature.
> >
> >
> > I think pg_dumpall should have separate examples similar to pg_dump
> > rather than referencing the pg_dump example because pg_dumpall
> > doesn't have to mention the database name without -l or --database
> > in the command.
> >
>
> Fixed. Added some examples.
>
> > 3.
> >>
> >> > 1. Is the following change in `src/bin/pg_dump/connectdb.c` intentional?
> >>
> >> >
> >> > ```
> >> > --- a/src/bin/pg_dump/connectdb.c
> >> > +++ b/src/bin/pg_dump/connectdb.c
> >> Yes, we need this. If there is any error, then we were trying to
> >> disconnect the database in 2 places so we were getting a crash. I will
> >> try to reproduce crashe without this patch and will respond.
> >
> > Have you added a test case in the regression suite which fails if we remove
> > this particular change and works well with the change? or if possible could
> > you please demonstrate here at least.
>
> Fixed. With AH(archive), we should not free pointers by this exec call
> as we free this by exit_nicely hook. (we register AH by
> on_exit_close_archive).
>
> >
> > 4. The variable name append_data doesn't look meaningful to me.
> > Instead we can use append_database/append_databases?
> > because if this variable is set then we dump the databases along with
> > global objects. In case of pg_dump, append_data or data_only does make
> > sense to differentiate between schema and data but in case of pg_dumpall
> > if this variable is set then we're dumping schema as well as data i.e. in-short
> > the databases.
> >
>
> As of now, I am keeping this append_data as this was from an already
> committed patch.
>
> > ------------------------------------ pg_dumpall.c ----------------------------------------
> >
> > 5. The variable name formatName doesn't follow the naming convention of
> > variables available around it. I think use of format_name/formatname would
> > be better.
> >
> >> char *use_role = NULL;
> >> const char *dumpencoding = NULL;
> >> + const char *formatName = "p";
> >> trivalue prompt_password = TRI_DEFAULT;
> >> bool data_only = false;
> >> bool globals_only = false;
> >
>
> Fixed.
>
> >
> > ------------------------------------ pg_restore.c ----------------------------------------
> >
> > 6. Fourth parameter (i.e. append_data) to function restore_global_objects() is redundant.
> > All the time value provided by all callers to this parameter is false.
> >
> > I would suggest removing this parameter and in the definition of this function
> > call function restore_one_database() with false as 4th argument. Find diff below:
> >
>
> Fixed.
>
> > --- a/src/bin/pg_dump/pg_restore.c
> > +++ b/src/bin/pg_dump/pg_restore.c
> > @@ -64,8 +64,7 @@ static int restore_one_database(const char *inputFileSpec, RestoreOptions *opts,
> > int numWorkers, bool append_data, int num,
> > bool globals_only);
> > static int restore_global_objects(const char *inputFileSpec,
> > - RestoreOptions *opts, int numWorkers, bool append_data,
> > - int num, bool globals_only);
> > + RestoreOptions *opts, int numWorkers, int num, bool globals_only);
> > static int restore_all_databases(const char *inputFileSpec,
> > SimpleStringList db_exclude_patterns, RestoreOptions *opts, int numWorkers);
> > static int get_dbnames_list_to_restore(PGconn *conn,
> > @@ -554,7 +553,7 @@ main(int argc, char **argv)
> >
> > /* Set path for toc.glo file. */
> > snprintf(global_path, MAXPGPATH, "%s/toc.glo", inputFileSpec);
> > - n_errors = restore_global_objects(global_path, opts, numWorkers, false, 0, globals_only);
> > + n_errors = restore_global_objects(global_path, opts, numWorkers, 0, globals_only);
> >
> > pg_log_info("database restoring skipped because option -g/--globals-only was specified");
> > }
> > @@ -602,7 +601,7 @@ main(int argc, char **argv)
> > * If globals_only is set, then skip DROP DATABASE commands from restore.
> > */
> > static int restore_global_objects(const char *inputFileSpec, RestoreOptions *opts,
> > - int numWorkers, bool append_data, int num, bool globals_only)
> > + int numWorkers, int num, bool globals_only)
> > {
> > int nerror;
> > int format = opts->format;
> > @@ -610,8 +609,8 @@ static int restore_global_objects(const char *inputFileSpec, RestoreOptions *opt
> > /* Set format as custom so that toc.glo file can be read. */
> > opts->format = archCustom;
> >
> > - nerror = restore_one_database(inputFileSpec, opts, numWorkers,
> > - append_data, num, globals_only);
> > + nerror = restore_one_database(inputFileSpec, opts, numWorkers, false, num,
> > + globals_only);
> >
> > /* Reset format value. */
> > opts->format = format;
> > @@ -1097,7 +1096,7 @@ restore_all_databases(const char *inputFileSpec,
> >
> > /* If map.dat has no entries, return after processing global commands. */
> > if (dbname_oid_list.head == NULL)
> > - return restore_global_objects(global_path, opts, numWorkers, false, 0, false);
> > + return restore_global_objects(global_path, opts, numWorkers, 0, false);
> >
> > pg_log_info(ngettext("found %d database name in \"%s\"",
> > "found %d database names in \"%s\"",
> > @@ -1151,7 +1150,7 @@ restore_all_databases(const char *inputFileSpec,
> > PQfinish(conn);
> >
> > /* Open toc.dat file and execute/append all the global sql commands. */
> > - n_errors_total = restore_global_objects(global_path, opts, numWorkers, false, 0, false);
> > + n_errors_total = restore_global_objects(global_path, opts, numWorkers, 0, false);
> >
> > Regression is successful with these changes.
> >
> > 7. Fix indentation:
> >>
> >> static int restore_global_objects(const char *inputFileSpec,
> >> RestoreOptions *opts, int numWorkers, bool append_data,
> >> int num, bool globals_only);
> >> static int restore_all_databases(const char *inputFileSpec,
> >> SimpleStringList db_exclude_patterns, RestoreOptions *opts, int numWorkers);
>
> Fixed some.
>
> >
> >
> > 8. Remove extra line:
> >>
> >> +
> >> static void usage(const char *progname);
> >
>
> Fixed.
>
> >
> > 9. Remove extra space after map.dat and before comma:
> >>
> >> + * databases from map.dat , but skip restoring those matching
> >
>
> Fixed.
>
> >
> > 10. Fix 80 char limits:
> >
> > + n_errors = restore_one_database(subdirpath, opts, numWorkers, true, 1, false);
> >
> > + num_total_db = get_dbname_oid_list_from_mfile(inputFileSpec, &dbname_oid_list);
> >
> > + return restore_global_objects(global_path, opts, numWorkers, false, 0, false);
> >
> > + n_errors_total = restore_global_objects(global_path, opts, numWorkers, false, 0, false);
> >
> > + pg_log_warning("errors ignored on database \"%s\" restore: %d", dbidname->str, n_errors);
> >
>
> Fixed some.
> I will do some more cleanup in the coming versions.
>
> Here, I am attaching an updated patch for the review and testing.
>
>
> >
> > Regards,
> > Vaibhav
> >
> > On Mon, Nov 17, 2025 at 10:45 PM Mahendra Singh Thalor <mahi6run(at)gmail(dot)com> wrote:
> >>
> >> Thanks Andrew for the review.
> >> On Tue, 11 Nov 2025 at 20:41, Andrew Dunstan <andrew(at)dunslane(dot)net> wrote:
> >> >
> >> >
> >> > On 2025-11-11 Tu 12:59 AM, Mahendra Singh Thalor wrote:
> >> > >
> >> > > Hi,
> >> > > Here, I am attaching an updated patch for the review and testing.
> >> > >
> >> > > FIX: as suggested by Vaibhav, added error for --restrict-key option
> >> > > with non-text format.
> >> > >
> >> >
> >> >
> >> > Regarding the name and format of the globals toc file, I'm inclined to
> >> > think we should always use custom format, regardless of whether the
> >> > individual databases will be in custom, tar or directory formats, and
> >> > that it should be called something distinguishable, e.g. toc.glo.
> >> >
> >>
> >> I also agree with your point. Fixed.
> >>
> >> On Mon, 17 Nov 2025 at 19:38, tushar <tushar(dot)ahuja(at)enterprisedb(dot)com> wrote:
> >> >
> >> >
> >> >
> >> > On Tue, Nov 11, 2025 at 11:29 AM Mahendra Singh Thalor <mahi6run(at)gmail(dot)com> wrote:
> >> >>
> >> >> On Thu, 6 Nov 2025 at 11:03, Mahendra Singh Thalor <mahi6run(at)gmail(dot)com> wrote:
> >> >> >
> >> >> > Thanks Vaibhav, Tushar and Andrew for the review and testing.
> >> >>
> >> >
> >> > Thanks Mahendra, getting this error against v07 series patch
> >> >
> >> > [edb(at)1a1c15437e7c bin]$ ./pg_dumpall -Ft -f tar.dumpc -v
> >> > pg_dumpall: executing SELECT pg_catalog.set_config('search_path', '', false);
> >> > pg_dumpall: pg_dumpall.c:2256: createOneArchiveEntry: Assertion `fout != ((void *)0)' failed.
> >> > Aborted
> >> >
> >> > regards,
> >>
> >> Thanks Tushar for the report. Fixed.
> >>
> >> Here, I am attaching an updated patch for the review and testing.
> >>
> >> --
> >> Thanks and Regards
> >> Mahendra Singh Thalor
> >> EnterpriseDB: http://www.enterprisedb.com
>
>
> --
> Thanks and Regards
> Mahendra Singh Thalor
> EnterpriseDB: http://www.enterprisedb.com

Hi,
CI was reporting an error for an unused variable.

[08:37:07.338] user 0m14.312s
[08:37:07.338] sys 0m9.155s
[08:37:07.338] make -s -j${BUILD_JOBS} clean
[08:37:07.850] time make -s -j${BUILD_JOBS} world-bin
[08:37:17.443] pg_restore.c:1080:8: error: variable 'count' set but
not used [-Werror,-Wunused-but-set-variable]
[08:37:17.443] 1080 | int count = 0;
[08:37:17.443] | ^
[08:37:17.443] 1 error generated.
[08:37:17.443] make[3]: *** [<builtin>: pg_restore.o] Error 1
[08:37:17.443] make[3]: *** Waiting for unfinished jobs....
[08:37:17.708] make[2]: *** [Makefile:45: all-pg_dump-recurse] Error 2
[08:37:17.709] make[1]: *** [Makefile:42: all-bin-recurse] Error 2
[08:37:17.709] mak

Fixed. Here, I am attaching an updated patch for the review and testing.

--
Thanks and Regards
Mahendra Singh Thalor
EnterpriseDB: http://www.enterprisedb.com

Attachment Content-Type Size
v10_27112025-Non-text-modes-for-pg_dumpall-correspondingly-change.patch application/octet-stream 89.1 KB

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message VASUKI M 2025-11-27 09:20:07 Re: [PATCH] psql: tab completion for ALTER ROLE ... IN DATABASE ...
Previous Message Amit Langote 2025-11-27 09:18:32 Re: Refactoring: Use soft error reporting for *_opt_overflow functions of date/timestamp