Re: Non-text mode for pg_dumpall

From: Vaibhav Dalvi <vaibhav(dot)dalvi(at)enterprisedb(dot)com>
To: Mahendra Singh Thalor <mahi6run(at)gmail(dot)com>
Cc: pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: Non-text mode for pg_dumpall
Date: 2025-11-05 13:16:38
Message-ID: CA+vB=AGsn4eUxsbLk_oy=iKzd8D_1Ne375XH-2u6Zncu72Q01Q@mail.gmail.com
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi Mahendra,

Here are a few more comments following my review of the patch:

### 1\. Incorrect Comment for `-g` (globals-only) Option

The comment for the `-g` case in the code states that it restores the
`global.dat` file. However, in the non-text dump output, I only see the
following files: `databases`, `map.dat`, and `toc.dat`.

```c
+ case 'g':
+ /* restore only global.dat file from directory */
+ globals_only = true;
+ break;
```

Please update this comment to accurately reflect the file being restored
(e.g., `toc.dat` or the global objects within the archive).

### 2\. Incorrect Order of `case` Statements in `pg_restore.c`

The new `case 7` statement in `pg_restore.c` appears to be
inserted before `case 6`, disrupting the numerical order.

```c
+ case 7: /* database patterns to skip */
+ simple_string_list_append(&db_exclude_patterns, optarg);
+ break;

case 6:
opts->restrict_key = pg_strdup(optarg);
```

Please re-order the `case` statements so they follow ascending
numerical order.

### 3\. Missing Example in SGML Documentation

The SGML documentation for `pg_dumpall` is missing an explicit
example demonstrating its use with non-text formats (e.g., directory
format).
It would be beneficial to include a clear example for this new feature.

### 4\. Cosmetic Issues

Please address the following minor stylistic points:

Please ensure the function signatures
adhere to standard coding style, particularly for line wrapping.
The following lines seem to have inconsistent indentation:

```c
static int restore_global_objects(const char *inputFileSpec, RestoreOptions
*opts,
int numWorkers, bool append_data, int num, bool globals_only);
static int restore_all_databases(const char *inputFileSpec,
SimpleStringList db_exclude_patterns, RestoreOptions *opts, int numWorkers);
```

Please fix instances where the 80-character line limit is
crossed, such as in the example below:

```c
n_errors = restore_one_database(subdirpath, opts, numWorkers, true, 1,
false);
```

I believe this concludes my formal review.

Thanks,
Vaibhav Dalvi

On Wed, Nov 5, 2025 at 12:29 PM Vaibhav Dalvi <
vaibhav(dot)dalvi(at)enterprisedb(dot)com> wrote:

> Hi Mahendra,
>
> Thank you for the fix. Please find my further review comments below.
>
> ### Restrict-Key Option
>
> The `--restrict-key` option is currently being accepted by
> `pg_dumpall` even when non-plain formats are specified,
> which contradicts its intended use only with the plain format.
>
> For example:
>
> ```
> $ ./db/bin/pg_dump --format=d -f testdump_dir --restrict-key=RESTRICT_KEY
> pg_dump: error: option --restrict-key can only be used with --format=plain
> $ ./db/bin/pg_dumpall --format=d -f testdump_dir
> --restrict-key=RESTRICT_KEY
> pg_dumpall: error: invalid restrict key
> ```
>
> I have attached a delta patch that addresses the issue with the
> `--restrict-key` option. It would be beneficial to include a dedicated
> test case for this check.
>
> ### Use of Dump Options Structure (dopt)
>
> Please ensure consistency by utilizing the main dump options
> structure (`dopt`) instead of declaring and using individual variables
> where the structure already provides fields. For example, the
> `output_clean` variable seems redundant here:
>
> ```c
> case 'c':
> output_clean = true;
> dopt.outputClean = 1;
> break;
> ```
>
> In my attached delta file, I have replaced the unnecessary
> `restrict_key` variable with `dopt.restrict_key`.
>
> ### Cosmetic Issues
>
> 1. Please review the spacing around the pointer:
> ```c
> + ((ArchiveHandle * )fout) ->connection = conn;
> + ((ArchiveHandle * ) fout) -> public.numWorkers = 1;
> ```
> 2. Please be consistent with the punctuation of single-line comments;
> some end with a full stop (`.`) and others do not.
> 3. In the SGML documentation changes, some new statements start
> with one space, and others start with two. Please adhere to a single
> standard for indentation across the patch.
>
> Regards,
> Vaibhav
> EnterpriseDB
>
> On Mon, Nov 3, 2025 at 5:24 PM Mahendra Singh Thalor <mahi6run(at)gmail(dot)com>
> wrote:
>
>> On Mon, 3 Nov 2025 at 12:06, Vaibhav Dalvi <
>> vaibhav(dot)dalvi(at)enterprisedb(dot)com> wrote:
>> >
>> > Hi Mahendra,
>> >
>> > Thank you for your work on this feature.
>> > I have just begun reviewing the latest patch and
>> > encountered the following errors during the initial setup:
>> >
>> > ```
>> > $ ./db/bin/pg_restore testdump_dir -C -d postgres -F d -p 5556
>> > pg_restore: error: could not execute query: ERROR: syntax error at or
>> near "\\"
>> > LINE 1: \restrict aO9K1gzVZTlafidF5fWx8ADGzUnIiAcguFz5qskGaFDygTCjCj...
>> > ^
>> > Command was: \restrict
>> aO9K1gzVZTlafidF5fWx8ADGzUnIiAcguFz5qskGaFDygTCjCj9vg3Xxys1b3hb
>> >
>> > pg_restore: error: could not execute query: ERROR: syntax error at or
>> near "\\"
>> > LINE 1: \unrestrict aO9K1gzVZTlafidF5fWx8ADGzUnIiAcguFz5qskGaFDygTCj...
>> > ^
>> > Command was: \unrestrict
>> aO9K1gzVZTlafidF5fWx8ADGzUnIiAcguFz5qskGaFDygTCjCj9vg3Xxys1b3hb
>> >
>> > pg_restore: error: could not execute query: ERROR: syntax error at or
>> near "\\"
>> > LINE 1: \connect template1
>> > ^
>> > Command was: \connect template1
>> >
>> > pg_restore: error: could not execute query: ERROR: syntax error at or
>> near "\\"
>> > LINE 1: \connect postgres
>> > ^
>> > Command was: \connect postgres
>> > ```
>> > To cross-check tried with plain dump(with pg_dumpall) and
>> > restored(SQL file restore) without patch and didn't get above
>> > connection errors.
>> >
>> > It appears there might be an issue with the dump file itself.
>> > Please note that this is my first observation as I have just
>> > started the review. I will continue with my assessment.
>> >
>> > Regards,
>> > Vaibhav Dalvi
>> > EnterpriseDB
>>
>> Thanks Vaibhav for the review.
>> This change was added by me in v04. Only in the case of a file, we should
>> restore these commands. Attached patch is fixing the same.
>>
>> If we dump and restore the same file with the same user, then we will get
>> an error of ROLE CREATE as the same role is already created. I think,
>> either we can ignore this error, or we can keep it as a restore can be done
>> with different users.
>>
>>> mst(at)localhost bin]$ ./pg_restore d1 -C -d postgres
>>> pg_restore: error: could not execute query: ERROR: role "mst" already
>>> exists
>>> Command was: CREATE ROLE mst;
>>> ALTER ROLE mst WITH SUPERUSER INHERIT CREATEROLE CREATEDB LOGIN
>>> REPLICATION BYPASSRLS;
>>>
>>>
>>> pg_restore: warning: errors ignored on restore: 1
>>
>>
>>
>> >
>> > On Fri, Oct 31, 2025 at 2:51 PM Mahendra Singh Thalor <
>> mahi6run(at)gmail(dot)com> wrote:
>> >>
>> >> On Tue, 28 Oct 2025 at 11:32, Mahendra Singh Thalor <
>> mahi6run(at)gmail(dot)com> wrote:
>> >> >
>> >> > On Thu, 16 Oct 2025 at 16:24, Mahendra Singh Thalor <
>> mahi6run(at)gmail(dot)com> wrote:
>> >> > >
>> >> > > On Wed, 15 Oct 2025 at 23:05, Mahendra Singh Thalor <
>> mahi6run(at)gmail(dot)com> wrote:
>> >> > > >
>> >> > > > On Sun, 24 Aug 2025 at 22:12, Andrew Dunstan <
>> andrew(at)dunslane(dot)net> wrote:
>> >> > > > >
>> >> > > > >
>> >> > > > > On 2025-08-23 Sa 9:08 PM, Noah Misch wrote:
>> >> > > > >
>> >> > > > > On Wed, Jul 30, 2025 at 02:51:59PM -0400, Andrew Dunstan wrote:
>> >> > > > >
>> >> > > > > OK, now that's reverted we should discuss how to proceed. I
>> had two thoughts
>> >> > > > > - we could use invent a JSON format for the globals, or we
>> could just use
>> >> > > > > the existing archive format. I think the archive format is
>> pretty flexible,
>> >> > > > > and should be able to accommodate this. The downside is it's
>> not humanly
>> >> > > > > readable. The upside is that we don't need to do anything
>> special either to
>> >> > > > > write it or parse it.
>> >> > > > >
>> >> > > > > I would first try to use the existing archiver API, because
>> that makes it
>> >> > > > > harder to miss bugs. Any tension between that API and
>> pg_dumpall is likely to
>> >> > > > > have corresponding tension on the pg_restore side. Resolving
>> that tension
>> >> > > > > will reveal much of the project's scope that remained hidden
>> during the v18
>> >> > > > > attempt. Perhaps more important than that, using the archiver
>> API means
>> >> > > > > future pg_dump and pg_restore options are more likely to
>> cooperate properly
>> >> > > > > with $SUBJECT. In other words, I want it to be hard to add
>> pg_dump/pg_restore
>> >> > > > > features that malfunction only for $SUBJECT archives. The
>> strength of the
>> >> > > > > archiver architecture shows in how rarely new features need
>> format-specific
>> >> > > > > logic and how rarely format-specific bugs get reported. We've
>> had little or
>> >> > > > > no trouble with e.g. bugs that appear in -Fd but not in -Fc.
>> >> > > > >
>> >> > > > >
>> >> > > > > Yeah, that's what we're going to try.
>> >> > > > >
>> >> > > > >
>> >> > > > > cheers
>> >> > > > >
>> >> > > > >
>> >> > > > > andrew
>> >> > > > >
>> >> > > > > --
>> >> > > > > Andrew Dunstan
>> >> > > > > EDB: https://www.enterprisedb.com
>> >> > > >
>> >> > > > Thanks Andrew, Noah and all others for feedback.
>> >> > > >
>> >> > > > Based on the above suggestions and discussions, I removed sql
>> commands
>> >> > > > from the global.dat file. For global commands, now we are making
>> >> > > > toc.dat/toc.dmp/toc.tar file based on format specified and based
>> on
>> >> > > > format specified, we are making archive entries for these global
>> >> > > > commands. By this approach, we removed the hard-coded parsing
>> part of
>> >> > > > the global.dat file and we are able to skip DROP DATABASE with
>> the
>> >> > > > globals-only option.
>> >> > > >
>> >> > > > Here, I am attaching a patch for review, testing and feedback.
>> This is
>> >> > > > a WIP patch. I will do some more code cleanup and will add some
>> more
>> >> > > > comments also. Please review this and let me know design level
>> >> > > > feedback. Thanks Tushar Ahuja for some internal testing and
>> feedback.
>> >> > > >
>> >> > >
>> >> > > Hi,
>> >> > > Here, I am attaching an updated patch. In offline discussion,
>> Andrew
>> >> > > reported some test-case failures(Thanks Andrew). I fixed those.
>> >> > > Please let me know feedback for the patch.
>> >> > >
>> >> >
>> >> > Hi,
>> >> > Here I am attaching a re-based patch as v02 was failing on head.
>> >> > Thanks Tushar for the testing.
>> >> > Please review this and let me know feedback.
>> >> >
>> >>
>> >> Hi all,
>> >> Here I am attaching an updated patch for review and testing. Based on
>> >> some offline comments by Andrew, I did some code cleanup.
>> >> Please consider this patch for feedback.
>> >>
>> >> --
>> >> Thanks and Regards
>> >> Mahendra Singh Thalor
>> >> EnterpriseDB: http://www.enterprisedb.com
>>
>>
>>
>> --
>> Thanks and Regards
>> Mahendra Singh Thalor
>> EnterpriseDB: http://www.enterprisedb.com
>>
>

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Bryan Green 2025-11-05 13:38:23 Re: [PATCH] Fix fragile walreceiver test.
Previous Message Heikki Linnakangas 2025-11-05 12:59:15 Re: LISTEN/NOTIFY bug: VACUUM sets frozenxid past a xid in async queue