RE: Parallel INSERT (INTO ... SELECT ...)

From: "Hou, Zhijie" <houzj(dot)fnst(at)cn(dot)fujitsu(dot)com>
To: Greg Nancarrow <gregn4422(at)gmail(dot)com>
Cc: "Tang, Haiying" <tanghy(dot)fnst(at)cn(dot)fujitsu(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Antonin Houska <ah(at)cybertec(dot)at>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>, vignesh C <vignesh21(at)gmail(dot)com>, Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, "Tsunakawa, Takayuki" <tsunakawa(dot)takay(at)fujitsu(dot)com>, Amit Langote <amitlangote09(at)gmail(dot)com>
Subject: RE: Parallel INSERT (INTO ... SELECT ...)
Date: 2021-02-09 12:18:49
Message-ID: 95066de635ce41eca7d06910c9f663e7@G08CNEXMBPEKD05.g08.fujitsu.local
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

> > > postgres=# explain verbose insert into testscan select a from x
> > > where
> > > a<80000 or (a%2=0 and a>199900000);
> > > QUERY PLAN
> > >
> --------------------------------------------------------------------
> > > --
> > > -----------------------------
> > > Gather (cost=4346.89..1281204.64 rows=81372 width=0)
> > > Workers Planned: 4
> > > -> Insert on public.testscan (cost=3346.89..1272067.44 rows=0
> > > width=0)
> > > -> Parallel Bitmap Heap Scan on public.x1
> > > (cost=3346.89..1272067.44 rows=20343 width=8)
> > > Output: x1.a, NULL::integer
> > > Recheck Cond: ((x1.a < 80000) OR (x1.a > 199900000))
> > > Filter: ((x1.a < 80000) OR (((x1.a % 2) = 0) AND
> > > (x1.a >
> > > 199900000)))
> > > -> BitmapOr (cost=3346.89..3346.89 rows=178808
> > > width=0)
> > > -> Bitmap Index Scan on x1_a_idx
> > > (cost=0.00..1495.19 rows=80883 width=0)
> > > Index Cond: (x1.a < 80000)
> > > -> Bitmap Index Scan on x1_a_idx
> > > (cost=0.00..1811.01 rows=97925 width=0)
> > > Index Cond: (x1.a > 199900000)
> > >
> > > PSA is my postgresql.conf file, maybe you can have a look. Besides,
> > > I didn't do any parameters tuning in my test session.
> >
> > I reproduced this on my machine.
> >
> > I think we'd better do "analyze" before insert which helps reproduce this
> easier.
> > Like:
> >
> > -----
> > analyze;
> > explain analyze verbose insert into testscan select a from x where
> > a<80000 or (a%2=0 and a>199900000);
> > -----
>
> OK then.
> Can you check if just the underlying SELECTs are run (without INSERT), is
> there any performance degradation when compared to a non-parallel scan?

It seems there is no performance degradation without insert.

Till now, what I found is that:
With tang's conf, when doing parallel insert, the walrecord is more than serial insert
(IMO, this is the main reason why it has performance degradation)
See the attatchment for the plan info.

I have tried alter the target table to unlogged and
then the performance degradation will not happen any more.

And the additional walrecord seems related to the index on the target table.
If the target table does not have any index, the wal record is the same between parallel plan and serial plan.
Also, it does not have performance degradation without index.

I am still looking at this problem, if someone think of something about it,
It's very grateful to share the knowledge with me.

Best regards,
houzj

Attachment Content-Type Size
serial_insert.txt text/plain 1.7 KB
parallel_insert.txt text/plain 3.1 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Bharath Rupireddy 2021-02-09 12:19:01 Re: TRUNCATE on foreign table
Previous Message Ashutosh Bapat 2021-02-09 12:01:32 Re: TRUNCATE on foreign table