From: | "tsunakawa(dot)takay(at)fujitsu(dot)com" <tsunakawa(dot)takay(at)fujitsu(dot)com> |
---|---|
To: | Greg Nancarrow <gregn4422(at)gmail(dot)com>, 'Amit Kapila' <amit(dot)kapila16(at)gmail(dot)com>, "tanghy(dot)fnst(at)cn(dot)fujitsu(dot)com" <tanghy(dot)fnst(at)cn(dot)fujitsu(dot)com> |
Cc: | Dilip Kumar <dilipbalaut(at)gmail(dot)com>, vignesh C <vignesh21(at)gmail(dot)com>, Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Antonin Houska <ah(at)cybertec(dot)at> |
Subject: | RE: Parallel INSERT (INTO ... SELECT ...) |
Date: | 2021-02-04 00:56:49 |
Message-ID: | TYAPR01MB2990DA4BCAEB495F999E0C36FEB39@TYAPR01MB2990.jpnprd01.prod.outlook.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
> On Mon, Jan 18, 2021 at 2:40 PM Tang, Haiying
> <tanghy(dot)fnst(at)cn(dot)fujitsu(dot)com> wrote:
> > Execute EXPLAIN on Patched:
> > postgres=# explain (ANALYZE, BUFFERS, VERBOSE) insert into test_part
> select * from test_data1;
> > QUERY PLAN
> >
> ---------------------------------------------------------------------------
> ---------------------------------------------
> > Insert on public.test_part (cost=0.00..15.00 rows=0 width=0) (actual
> time=44.139..44.140 rows=0 loops=1)
> > Buffers: shared hit=1005 read=1000 dirtied=3000 written=2000
> > -> Seq Scan on public.test_data1 (cost=0.00..15.00 rows=1000
> width=8) (actual time=0.007..0.201 rows=1000 loops=1)
> > Output: test_data1.a, test_data1.b
> > Buffers: shared hit=5
> > Planning:
> > Buffers: shared hit=27011
> > Planning Time: 24.526 ms
> > Execution Time: 44.981 ms
> >
> > Execute EXPLAIN on non-Patched:
> > postgres=# explain (ANALYZE, BUFFERS, VERBOSE) insert into test_part
> select * from test_data1;
> > QUERY PLAN
> >
> ---------------------------------------------------------------------------
> ---------------------------------------------
> > Insert on public.test_part (cost=0.00..15.00 rows=0 width=0) (actual
> time=72.656..72.657 rows=0 loops=1)
> > Buffers: shared hit=22075 read=1000 dirtied=3000 written=2000
> > -> Seq Scan on public.test_data1 (cost=0.00..15.00 rows=1000
> width=8) (actual time=0.010..0.175 rows=1000 loops=1)
> > Output: test_data1.a, test_data1.b
> > Buffers: shared hit=5
> > Planning:
> > Buffers: shared hit=72
> > Planning Time: 0.135 ms
> > Execution Time: 79.058 ms
> >
>
> So, the results indicate that after the patch we touch more buffers
> during planning which I think is because of accessing the partition
> information, and during execution, the patch touches fewer buffers for
> the same reason. But why this can reduce the time with patch? I think
> this needs some investigation.
I guess another factor other than shared buffers is relcache and catcache. The patched version loads those cached entries for all partitions of the insert target table during the parallel-safety check in planning, while the unpatched version has to gradually build those cache entries during execution. How can wee confirm its effect?
Regards
Takayuki Tsunakawa
From | Date | Subject | |
---|---|---|---|
Next Message | Tomas Vondra | 2021-02-04 00:59:03 | Re: WIP: BRIN multi-range indexes |
Previous Message | Zhihong Yu | 2021-02-04 00:49:34 | Re: WIP: BRIN multi-range indexes |