RE: pg_upgrade: Pass -j down to vacuumdb

From: "Jamison, Kirk" <k(dot)jamison(at)jp(dot)fujitsu(dot)com>
To: "'jesper(dot)pedersen(at)redhat(dot)com'" <jesper(dot)pedersen(at)redhat(dot)com>, "Alvaro Herrera" <alvherre(at)2ndquadrant(dot)com>
Cc: Michael Paquier <michael(at)paquier(dot)xyz>, Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com>, "fabriziomello(at)gmail(dot)com" <fabriziomello(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "pgsql-hackers(at)lists(dot)postgresql(dot)org" <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Bruce Momjian <bruce(at)momjian(dot)us>
Subject: RE: pg_upgrade: Pass -j down to vacuumdb
Date: 2019-02-04 02:34:05
Message-ID: D09B13F772D2274BB348A310EE3027C642DA10@g01jpexmbkw24
Views: Raw Message | Whole Thread | Download mbox
Lists: pgsql-hackers


On February 1, 2019 8:14 PM +0000, Jesper Pedersen wrote:

> On 2/1/19 4:58 AM, Alvaro Herrera wrote:
>> On 2019-Feb-01, Jamison, Kirk wrote:
>>> I'm not sure if misunderstood the purpose of $VACUUMDB_OPTS. I
>>> thought what the other developers suggested is to utilize it so that
>>> --jobs for vacuumdb would be optional and just based from user-specified option $VACUUMDB_OPTS.
>>> By which it would not utilize the amount of jobs used for pg_upgrade.
>>> To address your need of needing a parallel, the script would just
>>> then suggest if the user wants a --jobs option. The if/else for
>>> number of jobs is not needed in that case, maybe only for ifndef WIN32 else case.
>> No Kirk, you got it right -- (some of) those ifdefs are not needed, as
>> adding the --jobs in the command line is not needed. I do think some
>> extra whitespace in the format string is needed.

> The point of the patch is to highlight to the user that even though he/she does "pg_upgrade -j 8" the "-j 8" option isn't passed down to vacuumdb.
> Default value is 1, so in that case the echo command doesn't highlight that fact, otherwise it is. The user can then cancel the script and use the suggested command line.
> The script then executes the vaccumdb command with the $VACUUMDB_OPTS argument as noted in the documentation.
> Sample script attached as well.

Sorry I think I might have understood now where you are coming from,
where your script clarifies that it's not necessarily passed down.
If committers can let it pass, maybe it's necessary to highlight in the script,
Something like:
"If you would like default statistics as quickly as possible (e.g. run in parallel)..."

The difference is still perhaps your use of --jobs option
as somehow "explicitly" embedded in the code, compared to the suggestion of
making it optional by using the $VACUUMDB_OPTS variable, so that
jobs need not to be explicitly written, unless the user wants to.
(which I also think it's a better improvement because it's optional)

Some quick code check. sorry, don't have much time to write a patch for it today.
Perhaps you could write it similar to what you did in the --analyze-in-stages part.
Maybe something like (not sure if correct):
+ fprintf(script, "echo %s \"%s/vacuumdb\" %s $VACUUMDB_OPTS --all %s%s\n", ECHO_QUOTE,
+ new_cluster.bindir,,
+ /* Did we copy the free space files? */
+ (GET_MAJOR_VERSION(old_cluster.major_version) >= 804) ?
+ "--analyze-only" : "--analyze", ECHO_QUOTE);

I'll continue to review though from time to time.

Kirk Jamison

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Michael Paquier 2019-02-04 02:34:25 Re: Using POPCNT and other advanced bit manipulation instructions
Previous Message Michael Paquier 2019-02-04 02:29:34 Re: log bind parameter values on error