Re: Parallel INSERT (INTO ... SELECT ...)

From: Justin Pryzby <pryzby(at)telsasoft(dot)com>
To: "houzj(dot)fnst(at)fujitsu(dot)com" <houzj(dot)fnst(at)fujitsu(dot)com>
Cc: Greg Nancarrow <gregn4422(at)gmail(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, vignesh C <vignesh21(at)gmail(dot)com>, David Rowley <dgrowleyml(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "tsunakawa(dot)takay(at)fujitsu(dot)com" <tsunakawa(dot)takay(at)fujitsu(dot)com>, "tanghy(dot)fnst(at)fujitsu(dot)com" <tanghy(dot)fnst(at)fujitsu(dot)com>, Amit Langote <amitlangote09(at)gmail(dot)com>
Subject: Re: Parallel INSERT (INTO ... SELECT ...)
Date: 2021-03-11 21:10:06
Message-ID: 20210311211006.GF8796@telsasoft.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, Mar 11, 2021 at 01:01:42PM +0000, houzj(dot)fnst(at)fujitsu(dot)com wrote:
> > I guess to have the finer granularity we'd have to go with enable_parallel_insert,
> > which then would mean possibly having to later add enable_parallel_update,
> > should parallel update have similar potential overhead in the parallel-safety
> > checks (which to me, looks like it could, and parallel delete may not ...)
> >
> > It's a shame there is no "set" type for GUC options.
> > e.g.
> > enable_parallel_dml='insert,update'
> > Maybe that's going too far.

Isn't that just GUC_LIST_INPUT ?
I'm not sure why it'd be going to far ?

The GUC-setting assign hook can parse the enable_parallel_dml_list value set by
the user, and set an internal int/bits enable_parallel_dml variable with some
define/enum values like:

GUC_PARALLEL_DML_INSERT 0x01
GUC_PARALLEL_DML_DELETE 0x02
GUC_PARALLEL_DML_UPDATE 0x04

The namespace.c assign hook is a good prototype for this. The parsed, integer
GUC can probably be a static variable in clauses.c.

Then, the planner can check if:
|commandType == CMD_INSERT &&
| (enable_parallel_dml & GUC_PARALLEL_DML_INSERT) != 0
[...]

+ this table. When enabled (and provided that
+ <xref linkend="guc-enable-parallel-insert"/> is also <literal>true</literal>),

It seems like this usefully allows the GUC to be enabled, and reloption to be
disabled. But if the GUC is disabled, then it's impossible to enable for a
single table. That seems unfortunate. I think part of the issue is the
naming. If the GUC is called "enable_*", then setting it to "off" should
disable it entirely, for consistency with other GUCs. So maybe it needs
another name, like parallel_dml='insert'. I think maybe "all" should be
an accepted value.

Note also this CF entry
https://commitfest.postgresql.org/32/2987/
| Allow setting parallel_workers on partitioned tables

--
Justin

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Justin Pryzby 2021-03-11 21:24:05 Re: [HACKERS] Custom compression methods
Previous Message Robert Haas 2021-03-11 20:34:02 Re: Is Recovery actually paused?