RE: Parallel INSERT (INTO ... SELECT ...)

From: "tsunakawa(dot)takay(at)fujitsu(dot)com" <tsunakawa(dot)takay(at)fujitsu(dot)com>
To: "houzj(dot)fnst(at)cn(dot)fujitsu(dot)com" <houzj(dot)fnst(at)cn(dot)fujitsu(dot)com>, Greg Nancarrow <gregn4422(at)gmail(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: "tanghy(dot)fnst(at)cn(dot)fujitsu(dot)com" <tanghy(dot)fnst(at)cn(dot)fujitsu(dot)com>, Antonin Houska <ah(at)cybertec(dot)at>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>, vignesh C <vignesh21(at)gmail(dot)com>, Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Amit Langote <amitlangote09(at)gmail(dot)com>
Subject: RE: Parallel INSERT (INTO ... SELECT ...)
Date: 2021-02-12 07:32:48
Message-ID: TYAPR01MB2990F3A44DBD4A5A15BDEC37FE8B9@TYAPR01MB2990.jpnprd01.prod.outlook.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

From: Hou, Zhijie/侯 志杰 <houzj(dot)fnst(at)cn(dot)fujitsu(dot)com>
> If we diable bitmapscan, the performance degradation seems will not happen.

Yes, but that's because the hundreds of times slower sequential scan hides the insert time. Furthermore, as an aside, Worker 3 does much of the work in the parallel sequential scan + parallel insert case, while the load is well balanced in the parallel bitmap scan + parallel insert case.

Oracle and SQL Server executes parallel DML by holding an exclusive lock on the target table. They might use some special path for parallel DML to mitigate contention.

[serial bitmap scan + serial insert]
-> Bitmap Heap Scan on public.x (cost=3272.20..3652841.26 rows=79918 width=8) (actual time=8.096..41.005 rows=129999 loops=1)
...
Execution Time: 360.547 ms

[parallel bitmap scan + parallel insert]
-> Parallel Bitmap Heap Scan on public.x (cost=3272.20..1260119.35 rows=19980 width=8) (actual time=5.832..14.787 rows=26000 loops=5)
...
Execution Time: 382.776 ms

[serial sequential scan + serial insert]
-> Seq Scan on public.x (cost=0.00..5081085.52 rows=81338 width=8) (actual time=0.010..18997.317 rows=129999 loops=1)
...
Execution Time: 19311.700 ms

[parallel sequential scan + parallel insert]
-> Parallel Seq Scan on public.x (cost=0.00..2081082.88 rows=20334 width=8) (actual time=4001.641..5287.248 rows=32500 loops=4)
...
Execution Time: 5488.493 ms

Regards
Takayuki Tsunakawa

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message tsunakawa.takay@fujitsu.com 2021-02-12 07:43:00 RE: [POC] Fast COPY FROM command for the table with foreign partitions
Previous Message osumi.takamichi@fujitsu.com 2021-02-12 06:59:28 RE: [HACKERS] logical decoding of two-phase transactions