From: | Mahendra Singh Thalor <mahi6run(at)gmail(dot)com> |
---|---|
To: | Noah Misch <noah(at)leadboat(dot)com> |
Cc: | Andrew Dunstan <andrew(at)dunslane(dot)net>, Álvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, jian he <jian(dot)universality(at)gmail(dot)com>, Srinath Reddy <srinath2133(at)gmail(dot)com>, pgsql-hackers(at)lists(dot)postgresql(dot)org |
Subject: | Re: Non-text mode for pg_dumpall |
Date: | 2025-07-09 18:51:03 |
Message-ID: | CAKYtNAr-6_CCz+tnJ0T2597Ar4iAZXkE1qPimsO9EY-Kn7LvzQ@mail.gmail.com |
Views: | Whole Thread | Raw Message | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
Thanks Noah for the comments.
On Wed, 9 Jul 2025 at 02:58, Noah Misch <noah(at)leadboat(dot)com> wrote:
>
> On Fri, Apr 04, 2025 at 04:11:05PM -0400, Andrew Dunstan wrote:
> > Thanks. I have pushed these now with a few further small tweaks.
>
> This drops all databases:
>
> pg_dumpall --clean -Fd -f /tmp/dump
> pg_restore -d template1 --globals-only /tmp/dump
>
> That didn't match my expectations given this help text:
>
> $ pg_restore --help|grep global
> -g, --globals-only restore only global objects, no databases
Databases are global objects so due to --clean command, we are putting
drop commands in global.dat for all the databases. While restoring, we
used the "--globals-only" option so we are dropping all these
databases by global.dat file.
Please let us know your expectations for this specific case.
>
> This happens in dropDBs(). I found that by searching pg_dumpall.c for "OPF",
> which finds all the content we can write to globals.dat.
>
> commit 1495eff wrote:
> > --- a/src/bin/pg_dump/pg_dumpall.c
> > +++ b/src/bin/pg_dump/pg_dumpall.c
>
> > @@ -1612,9 +1683,27 @@ dumpDatabases(PGconn *conn)
> > continue;
> > }
> >
> > + /*
> > + * If this is not a plain format dump, then append dboid and dbname to
> > + * the map.dat file.
> > + */
> > + if (archDumpFormat != archNull)
> > + {
> > + if (archDumpFormat == archCustom)
> > + snprintf(dbfilepath, MAXPGPATH, "\"%s\"/\"%s\".dmp", db_subdir, oid);
> > + else if (archDumpFormat == archTar)
> > + snprintf(dbfilepath, MAXPGPATH, "\"%s\"/\"%s\".tar", db_subdir, oid);
> > + else
> > + snprintf(dbfilepath, MAXPGPATH, "\"%s\"/\"%s\"", db_subdir, oid);
>
> Use appendShellString() instead. Plain mode already does that for the
> "pg_dumpall -f" argument, which is part of db_subdir here. We don't want
> weird filename characters to work out differently for plain vs. non-plain
> mode. Also, it's easier to search for appendShellString() than to search for
> open-coded shell quoting.
Yes, we can use appendShellString also. We are using snprintf in the
pg_dump.c file also.
Ex: snprintf(tagbuf, sizeof(tagbuf), "LARGE OBJECTS %u..%u",
loinfo->looids[0], loinfo->looids[loinfo->numlos - 1]);
If we want to use appendShellString, I can write a patch for these.
Please let me know your opinion.
>
> > @@ -1641,19 +1727,30 @@ dumpDatabases(PGconn *conn)
> > if (filename)
> > fclose(OPF);
> >
> > - ret = runPgDump(dbname, create_opts);
> > + ret = runPgDump(dbname, create_opts, dbfilepath, archDumpFormat);
> > if (ret != 0)
> > pg_fatal("pg_dump failed on database \"%s\", exiting", dbname);
> >
> > if (filename)
> > {
> > - OPF = fopen(filename, PG_BINARY_A);
> > + char global_path[MAXPGPATH];
> > +
> > + if (archDumpFormat != archNull)
> > + snprintf(global_path, MAXPGPATH, "%s/global.dat", filename);
> > + else
> > + snprintf(global_path, MAXPGPATH, "%s", filename);
> > +
> > + OPF = fopen(global_path, PG_BINARY_A);
> > if (!OPF)
> > pg_fatal("could not re-open the output file \"%s\": %m",
> > - filename);
> > + global_path);
>
> Minor item: plain mode benefits from reopening, because pg_dump appended to
> the plain output file. There's no analogous need to reopen global.dat, since
> just this one process writes to global.dat.
yes, only once we need to open global.dat file but to keep simple
code, we kept old code.
>
> > @@ -1672,17 +1770,36 @@ runPgDump(const char *dbname, const char *create_opts)
> > initPQExpBuffer(&connstrbuf);
> > initPQExpBuffer(&cmd);
> >
> > - printfPQExpBuffer(&cmd, "\"%s\" %s %s", pg_dump_bin,
> > - pgdumpopts->data, create_opts);
> > -
> > /*
> > - * If we have a filename, use the undocumented plain-append pg_dump
> > - * format.
> > + * If this is not a plain format dump, then append file name and dump
> > + * format to the pg_dump command to get archive dump.
> > */
> > - if (filename)
> > - appendPQExpBufferStr(&cmd, " -Fa ");
> > + if (archDumpFormat != archNull)
> > + {
> > + printfPQExpBuffer(&cmd, "\"%s\" -f %s %s", pg_dump_bin,
> > + dbfile, create_opts);
> > +
> > + if (archDumpFormat == archDirectory)
> > + appendPQExpBufferStr(&cmd, " --format=directory ");
> > + else if (archDumpFormat == archCustom)
> > + appendPQExpBufferStr(&cmd, " --format=custom ");
> > + else if (archDumpFormat == archTar)
> > + appendPQExpBufferStr(&cmd, " --format=tar ");
> > + }
> > else
> > - appendPQExpBufferStr(&cmd, " -Fp ");
> > + {
> > + printfPQExpBuffer(&cmd, "\"%s\" %s %s", pg_dump_bin,
> > + pgdumpopts->data, create_opts);
>
> This uses pgdumpopts for plain mode only, so many pg_dumpall options silently
> have no effect in non-plain mode. Example:
>
> strace -f pg_dumpall --lock-wait-timeout=10 2>&1 >/dev/null | grep exec
> strace -f pg_dumpall --lock-wait-timeout=10 -Fd -f /tmp/dump3 2>&1 >/dev/null | grep exec
Agreed. We can add pgdumpopts->data to all the dump formats.
>
> > --- a/src/bin/pg_dump/pg_restore.c
> > +++ b/src/bin/pg_dump/pg_restore.c
>
> > +/*
> > + * read_one_statement
> > + *
> > + * This will start reading from passed file pointer using fgetc and read till
> > + * semicolon(sql statement terminator for global.dat file)
> > + *
> > + * EOF is returned if end-of-file input is seen; time to shut down.
>
> What makes it okay to use this particular subset of SQL lexing?
To support complex syntax, we used this code from another file.
>
> > +/*
> > + * get_dbnames_list_to_restore
> > + *
> > + * This will mark for skipping any entries from dbname_oid_list that pattern match an
> > + * entry in the db_exclude_patterns list.
> > + *
> > + * Returns the number of database to be restored.
> > + *
> > + */
> > +static int
> > +get_dbnames_list_to_restore(PGconn *conn,
> > + SimpleOidStringList *dbname_oid_list,
> > + SimpleStringList db_exclude_patterns)
> > +{
> > + int count_db = 0;
> > + PQExpBuffer query;
> > + PGresult *res;
> > +
> > + query = createPQExpBuffer();
> > +
> > + if (!conn)
> > + pg_log_info("considering PATTERN as NAME for --exclude-database option as no db connection while doing pg_restore.");
>
> When do we not have a connection here? We'd need to document this behavior
> variation if it stays, but I'd prefer if we can just rely on having a
> connection.
Yes, we can document this behavior.
>
> > + /* If database is already created, then don't set createDB flag. */
> > + if (opts->cparams.dbname)
> > + {
> > + PGconn *test_conn;
> > +
> > + test_conn = ConnectDatabase(db_cell->str, NULL, opts->cparams.pghost,
> > + opts->cparams.pgport, opts->cparams.username, TRI_DEFAULT,
> > + false, progname, NULL, NULL, NULL, NULL);
> > + if (test_conn)
> > + {
> > + PQfinish(test_conn);
> > +
> > + /* Use already created database for connection. */
> > + opts->createDB = 0;
> > + opts->cparams.dbname = db_cell->str;
> > + }
> > + else
> > + {
> > + /* we'll have to create it */
> > + opts->createDB = 1;
> > + opts->cparams.dbname = connected_db;
> > + }
>
> In released versions, "pg_restore --create" fails if the database exists, and
> pg_restore w/o --create fails unless the database exists. I think we should
> continue that pattern in this new feature. If not, pg_restore should document
> how it treats pg_dumpall-sourced dumps with the "create if not exists"
> semantics appearing here.
Yes, we can document this behavior also.
I am working on all these review comments and I will post a patch in
the coming days.
Thanks and Regards
Mahendra Singh Thalor
EnterpriseDB: http://www.enterprisedb.com
From | Date | Subject | |
---|---|---|---|
Next Message | David G. Johnston | 2025-07-09 19:05:12 | Re: 回复:[Internet]Re: [PATCH] Prevent replacement of a function if it's used in an index expression and is not IMMUTABLE |
Previous Message | Tom Lane | 2025-07-09 18:40:32 | Re: Why our Valgrind reports suck |