| From: | Mahendra Singh Thalor <mahi6run(at)gmail(dot)com> |
|---|---|
| To: | Vaibhav Dalvi <vaibhav(dot)dalvi(at)enterprisedb(dot)com> |
| Cc: | pgsql-hackers(at)lists(dot)postgresql(dot)org |
| Subject: | Re: Non-text mode for pg_dumpall |
| Date: | 2025-11-11 05:59:04 |
| Message-ID: | CAKYtNAoSTBqR24cn3XDOnwk2DCL+nAUkjpB5Xkz1H74rDqa-aQ@mail.gmail.com |
| Views: | Whole Thread | Raw Message | Download mbox | Resend email |
| Thread: | |
| Lists: | pgsql-hackers |
On Thu, 6 Nov 2025 at 11:03, Mahendra Singh Thalor <mahi6run(at)gmail(dot)com> wrote:
>
> Thanks Vaibhav, Tushar and Andrew for the review and testing.
>
> On Mon, 3 Nov 2025 at 17:30, Vaibhav Dalvi
> <vaibhav(dot)dalvi(at)enterprisedb(dot)com> wrote:
> >
> > Hi Mahendra,
> >
> > I have a few more review comments regarding the patch:
> >
> > 1. Is the following change in `src/bin/pg_dump/connectdb.c` intentional?
> >
> > ```
> > --- a/src/bin/pg_dump/connectdb.c
> > +++ b/src/bin/pg_dump/connectdb.c
>
> Yes, we need this. If there is any error, then we were trying to
> disconnect the database in 2 places so we were getting a crash. I will
> try to reproduce crashe without this patch and will respond.
>
> On Tue, 4 Nov 2025 at 18:23, tushar <tushar(dot)ahuja(at)enterprisedb(dot)com> wrote:
> > Thanks Mahendra, I am getting a segmentation fault against v05 patch.
> >
> > [edb(at)1a1c15437e7c bin]$ ./pg_dumpall -Ft --file a.3 -v
> > pg_dumpall: executing SELECT pg_catalog.set_config('search_path', '', false);
> > Segmentation fault
> >
> > Issue is coming with all output file formats -F[t/c/d] except plain
> >
> > regards,
>
> Thanks for the report. Fixed,
>
> On Tue, 4 Nov 2025 at 22:25, Andrew Dunstan <andrew(at)dunslane(dot)net> wrote:
> > Yeah, I don't think we need to dump the timestamp in non-text modes. This fix worked for me:
> >
> >
> > diff --git a/src/bin/pg_dump/pg_dumpall.c b/src/bin/pg_dump/pg_dumpall.c
> > index 601b9f9738e..f66cc26d9a2 100644
> > --- a/src/bin/pg_dump/pg_dumpall.c
> > +++ b/src/bin/pg_dump/pg_dumpall.c
> > @@ -638,7 +638,7 @@ main(int argc, char *argv[])
> > if (quote_all_identifiers)
> > executeCommand(conn, "SET quote_all_identifiers = true");
> >
> > - if (verbose)
> > + if (verbose && archDumpFormat == archNull)
> > dumpTimestamp("Started on");
>
> Thanks Andrew. Yes, we should not dump timestamp in non-text modes.
>
> On Wed, 5 Nov 2025 at 18:47, Vaibhav Dalvi
> <vaibhav(dot)dalvi(at)enterprisedb(dot)com> wrote:
> >
> > Hi Mahendra,
> >
> > Here are a few more comments following my review of the patch:
> >
> > ### 1\. Incorrect Comment for `-g` (globals-only) Option
> >
> > The comment for the `-g` case in the code states that it restores the
> > `global.dat` file. However, in the non-text dump output, I only see the
> > following files: `databases`, `map.dat`, and `toc.dat`.
>
> Fixed.
>
> >
> > ```c
> > + case 'g':
> > + /* restore only global.dat file from directory */
> > + globals_only = true;
> > + break;
>
> Fixed.
>
> > ```
> >
> > Please update this comment to accurately reflect the file being restored
> > (e.g., `toc.dat` or the global objects within the archive).
>
> Fixed.
>
> >
> > ### 2\. Incorrect Order of `case` Statements in `pg_restore.c`
> >
> > The new `case 7` statement in `pg_restore.c` appears to be
> > inserted before `case 6`, disrupting the numerical order.
> >
> > ```c
> > + case 7: /* database patterns to skip */
> > + simple_string_list_append(&db_exclude_patterns, optarg);
> > + break;
> >
> > case 6:
> > opts->restrict_key = pg_strdup(optarg);
> > ```
> >
> > Please re-order the `case` statements so they follow ascending
> > numerical order.
>
> Fixed.
>
> >
> > ### 3\. Missing Example in SGML Documentation
> >
> > The SGML documentation for `pg_dumpall` is missing an explicit
> > example demonstrating its use with non-text formats (e.g., directory format).
> > It would be beneficial to include a clear example for this new feature.
>
> I think we don't add such cases in doc. We already added test cases in
> code. If others also feel that we should add a test case in SGML, then
> I will update the doc with the test case.
>
> >
> > ### 4\. Cosmetic Issues
> >
> > Please address the following minor stylistic points:
> >
> > Please ensure the function signatures
> > adhere to standard coding style, particularly for line wrapping.
> > The following lines seem to have inconsistent indentation:
> >
> > ```c
> > static int restore_global_objects(const char *inputFileSpec, RestoreOptions *opts,
> > int numWorkers, bool append_data, int num, bool globals_only);
> > static int restore_all_databases(const char *inputFileSpec,
> > SimpleStringList db_exclude_patterns, RestoreOptions *opts, int numWorkers);
> > ```
> >
> > Please fix instances where the 80-character line limit is
> > crossed, such as in the example below:
>
> Fixed.
>
> >
> > ```c
> > n_errors = restore_one_database(subdirpath, opts, numWorkers, true, 1, false);
> > ```
> >
> > I believe this concludes my formal review.
> >
> > Thanks,
> > Vaibhav Dalvi
> >
> > On Wed, Nov 5, 2025 at 12:29 PM Vaibhav Dalvi <vaibhav(dot)dalvi(at)enterprisedb(dot)com> wrote:
> >>
> >> Hi Mahendra,
> >>
> >> Thank you for the fix. Please find my further review comments below.
> >>
> >> ### Restrict-Key Option
> >>
> >> The `--restrict-key` option is currently being accepted by
> >> `pg_dumpall` even when non-plain formats are specified,
> >> which contradicts its intended use only with the plain format.
> >>
> >> For example:
> >>
> >> ```
> >> $ ./db/bin/pg_dump --format=d -f testdump_dir --restrict-key=RESTRICT_KEY
> >> pg_dump: error: option --restrict-key can only be used with --format=plain
> >> $ ./db/bin/pg_dumpall --format=d -f testdump_dir --restrict-key=RESTRICT_KEY
> >> pg_dumpall: error: invalid restrict key
> >> ```
> >>
> >> I have attached a delta patch that addresses the issue with the
> >> `--restrict-key` option. It would be beneficial to include a dedicated
> >> test case for this check.
>
> We should dump restrict-key with all modes as we need to restore with
> the "-f file" option in text mode.
> Ex: pg_dumpall --format=d -f testdump_dir
> and restore::: pg_restore testdump_dir -d dabasename -C -f testdumpfile
> (In testdumpfile, we will generate commands from archive dump)
>
> So I am not merging this delat patch.
>
> >>
> >> ### Use of Dump Options Structure (dopt)
> >>
> >> Please ensure consistency by utilizing the main dump options
> >> structure (`dopt`) instead of declaring and using individual variables
> >> where the structure already provides fields. For example, the
> >> `output_clean` variable seems redundant here:
> >>
> >> ```c
> >> case 'c':
> >> output_clean = true;
> >> dopt.outputClean = 1;
> >> break;
>
> output_clean is not added by this patch. I will analyse this comment
> and will respond in the next update.
>
> >> ```
> >>
> >> In my attached delta file, I have replaced the unnecessary
> >> `restrict_key` variable with `dopt.restrict_key`.
>
> This is also not part of this patch. If you feel to add this in DOPT,
> please suggest in separate thread.
>
> >>
> >> ### Cosmetic Issues
> >>
> >> 1. Please review the spacing around the pointer:
> >> ```c
> >> + ((ArchiveHandle * )fout) ->connection = conn;
> >> + ((ArchiveHandle * ) fout) -> public.numWorkers = 1;
>
> Fixed.
>
> >> ```
> >> 2. Please be consistent with the punctuation of single-line comments;
> >> some end with a full stop (`.`) and others do not.
>
> Based on nearby code comments, I made changes. I will try to fix these
> inconsistencies..
>
>
> >> 3. In the SGML documentation changes, some new statements start
> >> with one space, and others start with two. Please adhere to a single
> >> standard for indentation across the patch.
>
> Okay. I will fix these.
>
> >>
> >> Regards,
> >> Vaibhav
> >> EnterpriseDB
> >>
> >> On Mon, Nov 3, 2025 at 5:24 PM Mahendra Singh Thalor <mahi6run(at)gmail(dot)com> wrote:
> >>>
> >>> On Mon, 3 Nov 2025 at 12:06, Vaibhav Dalvi <vaibhav(dot)dalvi(at)enterprisedb(dot)com> wrote:
> >>> >
> >>> > Hi Mahendra,
> >>> >
> >>> > Thank you for your work on this feature.
> >>> > I have just begun reviewing the latest patch and
> >>> > encountered the following errors during the initial setup:
> >>> >
> >>> > ```
> >>> > $ ./db/bin/pg_restore testdump_dir -C -d postgres -F d -p 5556
> >>> > pg_restore: error: could not execute query: ERROR: syntax error at or near "\\"
> >>> > LINE 1: \restrict aO9K1gzVZTlafidF5fWx8ADGzUnIiAcguFz5qskGaFDygTCjCj...
> >>> > ^
> >>> > Command was: \restrict aO9K1gzVZTlafidF5fWx8ADGzUnIiAcguFz5qskGaFDygTCjCj9vg3Xxys1b3hb
> >>> >
> >>> > pg_restore: error: could not execute query: ERROR: syntax error at or near "\\"
> >>> > LINE 1: \unrestrict aO9K1gzVZTlafidF5fWx8ADGzUnIiAcguFz5qskGaFDygTCj...
> >>> > ^
> >>> > Command was: \unrestrict aO9K1gzVZTlafidF5fWx8ADGzUnIiAcguFz5qskGaFDygTCjCj9vg3Xxys1b3hb
> >>> >
> >>> > pg_restore: error: could not execute query: ERROR: syntax error at or near "\\"
> >>> > LINE 1: \connect template1
> >>> > ^
> >>> > Command was: \connect template1
> >>> >
> >>> > pg_restore: error: could not execute query: ERROR: syntax error at or near "\\"
> >>> > LINE 1: \connect postgres
> >>> > ^
> >>> > Command was: \connect postgres
> >>> > ```
> >>> > To cross-check tried with plain dump(with pg_dumpall) and
> >>> > restored(SQL file restore) without patch and didn't get above
> >>> > connection errors.
> >>> >
> >>> > It appears there might be an issue with the dump file itself.
> >>> > Please note that this is my first observation as I have just
> >>> > started the review. I will continue with my assessment.
> >>> >
> >>> > Regards,
> >>> > Vaibhav Dalvi
> >>> > EnterpriseDB
> >>>
> >>> Thanks Vaibhav for the review.
> >>> This change was added by me in v04. Only in the case of a file, we should restore these commands. Attached patch is fixing the same.
> >>>
> >>> If we dump and restore the same file with the same user, then we will get an error of ROLE CREATE as the same role is already created. I think, either we can ignore this error, or we can keep it as a restore can be done with different users.
> >>>>
> >>>> mst(at)localhost bin]$ ./pg_restore d1 -C -d postgres
> >>>> pg_restore: error: could not execute query: ERROR: role "mst" already exists
> >>>> Command was: CREATE ROLE mst;
> >>>> ALTER ROLE mst WITH SUPERUSER INHERIT CREATEROLE CREATEDB LOGIN REPLICATION BYPASSRLS;
> >>>>
> >>>>
> >>>> pg_restore: warning: errors ignored on restore: 1
> >>>
> >>>
> >>>
> >>> >
> >>> > On Fri, Oct 31, 2025 at 2:51 PM Mahendra Singh Thalor <mahi6run(at)gmail(dot)com> wrote:
> >>> >>
> >>> >> On Tue, 28 Oct 2025 at 11:32, Mahendra Singh Thalor <mahi6run(at)gmail(dot)com> wrote:
> >>> >> >
> >>> >> > On Thu, 16 Oct 2025 at 16:24, Mahendra Singh Thalor <mahi6run(at)gmail(dot)com> wrote:
> >>> >> > >
> >>> >> > > On Wed, 15 Oct 2025 at 23:05, Mahendra Singh Thalor <mahi6run(at)gmail(dot)com> wrote:
> >>> >> > > >
> >>> >> > > > On Sun, 24 Aug 2025 at 22:12, Andrew Dunstan <andrew(at)dunslane(dot)net> wrote:
> >>> >> > > > >
> >>> >> > > > >
> >>> >> > > > > On 2025-08-23 Sa 9:08 PM, Noah Misch wrote:
> >>> >> > > > >
> >>> >> > > > > On Wed, Jul 30, 2025 at 02:51:59PM -0400, Andrew Dunstan wrote:
> >>> >> > > > >
> >>> >> > > > > OK, now that's reverted we should discuss how to proceed. I had two thoughts
> >>> >> > > > > - we could use invent a JSON format for the globals, or we could just use
> >>> >> > > > > the existing archive format. I think the archive format is pretty flexible,
> >>> >> > > > > and should be able to accommodate this. The downside is it's not humanly
> >>> >> > > > > readable. The upside is that we don't need to do anything special either to
> >>> >> > > > > write it or parse it.
> >>> >> > > > >
> >>> >> > > > > I would first try to use the existing archiver API, because that makes it
> >>> >> > > > > harder to miss bugs. Any tension between that API and pg_dumpall is likely to
> >>> >> > > > > have corresponding tension on the pg_restore side. Resolving that tension
> >>> >> > > > > will reveal much of the project's scope that remained hidden during the v18
> >>> >> > > > > attempt. Perhaps more important than that, using the archiver API means
> >>> >> > > > > future pg_dump and pg_restore options are more likely to cooperate properly
> >>> >> > > > > with $SUBJECT. In other words, I want it to be hard to add pg_dump/pg_restore
> >>> >> > > > > features that malfunction only for $SUBJECT archives. The strength of the
> >>> >> > > > > archiver architecture shows in how rarely new features need format-specific
> >>> >> > > > > logic and how rarely format-specific bugs get reported. We've had little or
> >>> >> > > > > no trouble with e.g. bugs that appear in -Fd but not in -Fc.
> >>> >> > > > >
> >>> >> > > > >
> >>> >> > > > > Yeah, that's what we're going to try.
> >>> >> > > > >
> >>> >> > > > >
> >>> >> > > > > cheers
> >>> >> > > > >
> >>> >> > > > >
> >>> >> > > > > andrew
> >>> >> > > > >
> >>> >> > > > > --
> >>> >> > > > > Andrew Dunstan
> >>> >> > > > > EDB: https://www.enterprisedb.com
> >>> >> > > >
> >>> >> > > > Thanks Andrew, Noah and all others for feedback.
> >>> >> > > >
> >>> >> > > > Based on the above suggestions and discussions, I removed sql commands
> >>> >> > > > from the global.dat file. For global commands, now we are making
> >>> >> > > > toc.dat/toc.dmp/toc.tar file based on format specified and based on
> >>> >> > > > format specified, we are making archive entries for these global
> >>> >> > > > commands. By this approach, we removed the hard-coded parsing part of
> >>> >> > > > the global.dat file and we are able to skip DROP DATABASE with the
> >>> >> > > > globals-only option.
> >>> >> > > >
> >>> >> > > > Here, I am attaching a patch for review, testing and feedback. This is
> >>> >> > > > a WIP patch. I will do some more code cleanup and will add some more
> >>> >> > > > comments also. Please review this and let me know design level
> >>> >> > > > feedback. Thanks Tushar Ahuja for some internal testing and feedback.
> >>> >> > > >
> >>> >> > >
> >>> >> > > Hi,
> >>> >> > > Here, I am attaching an updated patch. In offline discussion, Andrew
> >>> >> > > reported some test-case failures(Thanks Andrew). I fixed those.
> >>> >> > > Please let me know feedback for the patch.
> >>> >> > >
> >>> >> >
> >>> >> > Hi,
> >>> >> > Here I am attaching a re-based patch as v02 was failing on head.
> >>> >> > Thanks Tushar for the testing.
> >>> >> > Please review this and let me know feedback.
> >>> >> >
> >>> >>
> >>> >> Hi all,
> >>> >> Here I am attaching an updated patch for review and testing. Based on
> >>> >> some offline comments by Andrew, I did some code cleanup.
> >>> >> Please consider this patch for feedback.
> >>> >>
> >>> >> --
> >>> >> Thanks and Regards
> >>> >> Mahendra Singh Thalor
> >>> >> EnterpriseDB: http://www.enterprisedb.com
> >>>
> >>>
> >>>
> >>> --
> >>> Thanks and Regards
> >>> Mahendra Singh Thalor
> >>> EnterpriseDB: http://www.enterprisedb.com
>
> Here, I am attaching an updated patch for the review and testing.
>
> --
> Thanks and Regards
> Mahendra Singh Thalor
> EnterpriseDB: http://www.enterprisedb.com
Hi,
Here, I am attaching an updated patch for the review and testing.
FIX: as suggested by Vaibhav, added error for --restrict-key option
with non-text format.
--
Thanks and Regards
Mahendra Singh Thalor
EnterpriseDB: http://www.enterprisedb.com
| Attachment | Content-Type | Size |
|---|---|---|
| v07_11112025-Non-text-modes-for-pg_dumpall-correspondingly-change.patch | application/octet-stream | 84.4 KB |
| From | Date | Subject | |
|---|---|---|---|
| Next Message | Chao Li | 2025-11-11 06:13:44 | Re: [PATCH] Add hints for invalid binary encoding names in encode/decode functions |
| Previous Message | Chao Li | 2025-11-11 05:57:33 | Re: Should we say "wal_level = logical" instead of "wal_level >= logical" |