RE: Parallel INSERT (INTO ... SELECT ...)

From: "tsunakawa(dot)takay(at)fujitsu(dot)com" <tsunakawa(dot)takay(at)fujitsu(dot)com>
To: 'Amit Langote' <amitlangote09(at)gmail(dot)com>, Greg Nancarrow <gregn4422(at)gmail(dot)com>
Cc: "houzj(dot)fnst(at)cn(dot)fujitsu(dot)com" <houzj(dot)fnst(at)cn(dot)fujitsu(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, vignesh C <vignesh21(at)gmail(dot)com>, David Rowley <dgrowleyml(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "tanghy(dot)fnst(at)cn(dot)fujitsu(dot)com" <tanghy(dot)fnst(at)cn(dot)fujitsu(dot)com>
Subject: RE: Parallel INSERT (INTO ... SELECT ...)
Date: 2021-02-10 08:03:40
Message-ID: TYAPR01MB29905F6B49E4578AE9000ECEFE8D9@TYAPR01MB2990.jpnprd01.prod.outlook.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

From: Amit Langote <amitlangote09(at)gmail(dot)com>
> On Wed, Feb 10, 2021 at 1:35 PM Greg Nancarrow <gregn4422(at)gmail(dot)com>
> wrote:
> > There's no "second-guessing" involved here.
> > There is no underlying way of dividing up the VALUES data of
> > "INSERT...VALUES" amongst the parallel workers, even if the planner
> > was updated to produce a parallel-plan for the "INSERT...VALUES" case
> > (apart from the fact that spawning off parallel workers to insert that
> > data would almost always result in worse performance than a
> > non-parallel plan...)
> > The division of work for parallel workers is part of the table AM
> > (scan) implementation, which is not invoked for "INSERT...VALUES".
>
> I don't disagree that the planner would not normally assign a parallel
> path simply to pull values out of a VALUES list mentioned in the
> INSERT command, but deciding something based on the certainty of it in
> an earlier planning phase seems odd to me. Maybe that's just me
> though.

In terms of competitiveness, Oracle does not run INSERT VALUES in parallel:

https://docs.oracle.com/en/database/oracle/oracle-database/21/vldbg/types-parallelism.html#GUID-6626C70C-876C-47A4-8C01-9B66574062D8

"The INSERT VALUES statement is never executed in parallel."

And SQL Server doesn't either:

https://docs.microsoft.com/en-us/sql/relational-databases/query-processing-architecture-guide?view=sql-server-ver15

"Starting with SQL Server 2016 (13.x) and database compatibility level 130, the INSERT … SELECT statement can be executed in parallel when inserting into heaps or clustered columnstore indexes (CCI), and using the TABLOCK hint."

Regards
Takayuki Tsunakawa

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Michael Paquier 2021-02-10 08:04:04 Re: Preserve attstattarget on REINDEX CONCURRENTLY
Previous Message Heikki Linnakangas 2021-02-10 07:58:18 Re: 64-bit XIDs in deleted nbtree pages