RE: Parallel INSERT (INTO ... SELECT ...)

From: "tsunakawa(dot)takay(at)fujitsu(dot)com" <tsunakawa(dot)takay(at)fujitsu(dot)com>
To: "houzj(dot)fnst(at)cn(dot)fujitsu(dot)com" <houzj(dot)fnst(at)cn(dot)fujitsu(dot)com>, Greg Nancarrow <gregn4422(at)gmail(dot)com>
Cc: "tanghy(dot)fnst(at)cn(dot)fujitsu(dot)com" <tanghy(dot)fnst(at)cn(dot)fujitsu(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Antonin Houska <ah(at)cybertec(dot)at>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>, vignesh C <vignesh21(at)gmail(dot)com>, Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Amit Langote <amitlangote09(at)gmail(dot)com>
Subject: RE: Parallel INSERT (INTO ... SELECT ...)
Date: 2021-02-10 01:11:43
Message-ID: TYAPR01MB2990825902E70EC37D945479FE8D9@TYAPR01MB2990.jpnprd01.prod.outlook.com
Lists: pgsql-hackers

From: Hou, Zhijie/侯 志杰 <houzj(dot)fnst(at)cn(dot)fujitsu(dot)com>
> So far, what I have found is that:
> With tang's conf, parallel insert generates more WAL records than serial
> insert (IMO, this is the main reason for the performance degradation). See
> the attachment for the plan info.
>
> I have tried altering the target table to unlogged, and then the performance
> degradation no longer happens.
>
> The additional WAL records seem to be related to the index on the target table.
> If the target table does not have any index, the WAL records are the same
> between the parallel plan and the serial plan.
> Also, there is no performance degradation without the index.

[serial]
Insert on public.testscan (cost=3272.20..3652841.26 rows=0 width=0) (actual time=360.474..360.476 rows=0 loops=1)
Buffers: shared hit=392569 read=3 dirtied=934 written=933
WAL: records=260354 bytes=16259841

[parallel]
-> Insert on public.testscan (cost=3272.20..1260119.35 rows=0 width=0) (actual time=378.227..378.229 rows=0 loops=5)
Buffers: shared hit=407094 read=4 dirtied=1085 written=1158
WAL: records=260498 bytes=17019359

More pages are dirtied and written in the parallel execution. Aren't the index, and possibly the target table, bigger after parallel execution than after serial execution? That may be due to a difference in how the index keys are inserted.
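
For reference, the gap between the two plans can be worked out directly from the Buffers and WAL lines quoted above. The following is only a minimal sketch in Python; the helper names (parse_counters, deltas) are made up for illustration and are not part of the patch or of PostgreSQL.

```python
import re

# Counters copied verbatim from the [serial] and [parallel] plan output above.
SERIAL = """\
Buffers: shared hit=392569 read=3 dirtied=934 written=933
WAL: records=260354 bytes=16259841
"""
PARALLEL = """\
Buffers: shared hit=407094 read=4 dirtied=1085 written=1158
WAL: records=260498 bytes=17019359
"""


def parse_counters(text):
    """Collect every key=value integer pair from a plan excerpt (hypothetical helper)."""
    return {key: int(value) for key, value in re.findall(r"(\w+)=(\d+)", text)}


def deltas(serial, parallel):
    """Return parallel minus serial for each counter (hypothetical helper)."""
    return {key: parallel[key] - serial[key] for key in serial}


serial = parse_counters(SERIAL)
parallel = parse_counters(PARALLEL)
diff = deltas(serial, parallel)

# Print the absolute and relative increase of each counter.
for key, value in diff.items():
    print(f"{key:>8}: {value:+d} ({100.0 * value / serial[key]:+.2f}%)")
```

With the numbers above, the parallel plan has 144 more WAL records and 759518 more WAL bytes (about 4.7%), with 151 more pages dirtied and 225 more pages written. The increase in bytes is large relative to the increase in records.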

Regards
Takayuki Tsunakawa
