pgsql: Enable parallel SELECT for "INSERT INTO ... SELECT ...".

From: Amit Kapila <akapila(at)postgresql(dot)org>
To: pgsql-committers(at)lists(dot)postgresql(dot)org
Subject: pgsql: Enable parallel SELECT for "INSERT INTO ... SELECT ...".
Date: 2021-03-10 02:17:54
Message-ID: E1lJoQ6-0005BJ-DY@gemulon.postgresql.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-committers pgsql-hackers

Enable parallel SELECT for "INSERT INTO ... SELECT ...".

Parallel SELECT can't be utilized for INSERT in the following cases:
- INSERT statement uses the ON CONFLICT DO UPDATE clause
- Target table has a parallel-unsafe: trigger, index expression or
predicate, column default expression or check constraint
- Target table has a parallel-unsafe domain constraint on any column
- Target table is a partitioned table with a parallel-unsafe partition key
expression or support function

The planner is updated to perform additional parallel-safety checks for
the cases listed above, for determining whether it is safe to run INSERT
in parallel-mode with an underlying parallel SELECT. The planner will
consider using parallel SELECT for "INSERT INTO ... SELECT ...", provided
nothing unsafe is found from the additional parallel-safety checks, or
from the existing parallel-safety checks for SELECT.

While checking parallel-safety, we need to check it for all the partitions
on the table which can be costly especially when we decide not to use a
parallel plan. So, in a separate patch, we will introduce a GUC and or a
reloption to enable/disable parallelism for Insert statements.

Prior to entering parallel-mode for the execution of INSERT with parallel
SELECT, a TransactionId is acquired and assigned to the current
transaction state. This is necessary to prevent the INSERT from attempting
to assign the TransactionId whilst in parallel-mode, which is not allowed.
This approach has a disadvantage in that if the underlying SELECT does not
return any rows, then the TransactionId is not used, however that
shouldn't happen in practice in many cases.

Author: Greg Nancarrow, Amit Langote, Amit Kapila
Reviewed-by: Amit Langote, Hou Zhijie, Takayuki Tsunakawa, Antonin Houska, Bharath Rupireddy, Dilip Kumar, Vignesh C, Zhihong Yu, Amit Kapila
Tested-by: Tang, Haiying
Discussion: https://postgr.es/m/CAJcOf-cXnB5cnMKqWEp2E2z7Mvcd04iLVmV=qpFJrR3AcrTS3g@mail.gmail.com
Discussion: https://postgr.es/m/CAJcOf-fAdj=nDKMsRhQzndm-O13NY4dL6xGcEvdX5Xvbbi0V7g@mail.gmail.com

Branch
------
master

Details
-------
https://git.postgresql.org/pg/commitdiff/05c8482f7f69a954fd65fce85f896e848fc48197

Modified Files
--------------
doc/src/sgml/parallel.sgml | 4 +-
src/backend/access/transam/xact.c | 26 ++
src/backend/executor/execMain.c | 3 +
src/backend/nodes/copyfuncs.c | 1 +
src/backend/nodes/outfuncs.c | 2 +
src/backend/nodes/readfuncs.c | 1 +
src/backend/optimizer/plan/planner.c | 37 +-
src/backend/optimizer/util/clauses.c | 550 +++++++++++++++++++++++++-
src/backend/utils/cache/plancache.c | 33 +-
src/include/access/xact.h | 15 +
src/include/nodes/pathnodes.h | 2 +
src/include/nodes/plannodes.h | 2 +
src/include/optimizer/clauses.h | 3 +-
src/test/regress/expected/insert_parallel.out | 536 +++++++++++++++++++++++++
src/test/regress/parallel_schedule | 1 +
src/test/regress/serial_schedule | 1 +
src/test/regress/sql/insert_parallel.sql | 335 ++++++++++++++++
17 files changed, 1531 insertions(+), 21 deletions(-)

Responses

Browse pgsql-committers by date

  From Date Subject
Next Message Tom Lane 2021-03-10 03:37:35 Re: pgsql: Enable parallel SELECT for "INSERT INTO ... SELECT ...".
Previous Message Michael Paquier 2021-03-10 01:05:09 Re: pgsql: Remove support for SSL compression

Browse pgsql-hackers by date

  From Date Subject
Next Message houzj.fnst@fujitsu.com 2021-03-10 02:24:58 RE: Avoid CommandCounterIncrement in RI trigger when INSERT INTO referencing table
Previous Message Tomas Vondra 2021-03-10 02:00:25 Re: Columns correlation and adaptive query optimization