Re: Avoid parallel full and right join paths.

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Mithun Cy <mithun(dot)cy(at)enterprisedb(dot)com>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Subject: Re: Avoid parallel full and right join paths.
Date: 2016-04-20 21:49:46
Message-ID: CA+Tgmoa1piMF26OvOanVpWwu3+8h2LDMMM-MGSd9dAG1E83poQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, Apr 19, 2016 at 10:21 AM, Mithun Cy <mithun(dot)cy(at)enterprisedb(dot)com> wrote:
> Tests:
> create table mytab(x int,x1 char(9),x2 varchar(9));
> create table mytab1(y int,y1 char(9),y2 varchar(9));
> insert into mytab values (generate_series(1,50000),'aa','aaa');
> insert into mytab1 values (generate_series(1,10000),'aa','aaa');
> insert into mytab values (generate_series(1,500000),'aa','aaa');
> insert into mytab values (generate_series(1,500000),'aa','aaa');
> analyze mytab;
> analyze mytab1;
> vacuum mytab;
> vacuum mytab1;
>
> set max_parallel_degree=0;
> SET
> df=# SELECT count(*) FROM mytab RIGHT OUTER JOIN mytab1
> ON mytab.x = mytab1.y;
> count
> -------
> 30000
> (1 row)
>
> # set max_parallel_degree=5;
> SET
> df=# SELECT count(*) FROM mytab RIGHT OUTER JOIN mytab1
> ON mytab.x = mytab1.y;
> count
> -------
> 39089
> (1 row)
>
> Casue:
> ======
> Normal plan
> ==========
> explain SELECT count(*) FROM mytab RIGHT OUTER JOIN mytab1
> ON mytab.x = mytab1.y;postgres-#
> QUERY PLAN
> ------------------------------------------------------------------------------
> Aggregate (cost=21682.71..21682.72 rows=1 width=8)
> -> Hash Right Join (cost=289.00..21629.07 rows=21457 width=0)
> Hash Cond: (mytab.x = mytab1.y)
> -> Seq Scan on mytab (cost=0.00..17188.00 rows=1050000 width=4)
> -> Hash (cost=164.00..164.00 rows=10000 width=4)
> -> Seq Scan on mytab1 (cost=0.00..164.00 rows=10000 width=4)
> =================================================================
>
> Parallel plan.
> ==========
> explain SELECT count(*) FROM mytab RIGHT OUTER JOIN mytab1
> ON mytab.x = mytab1.y;postgres-#
> QUERY PLAN
> -----------------------------------------------------------------------------------------------
> Finalize Aggregate (cost=14135.88..14135.89 rows=1 width=8)
> -> Gather (cost=14135.67..14135.88 rows=2 width=8)
> Number of Workers: 2
> -> Partial Aggregate (cost=13135.67..13135.68 rows=1 width=8)
> -> Hash Right Join (cost=289.00..13082.02 rows=21457 width=0)
> Hash Cond: (mytab.x = mytab1.y)
> -> Parallel Seq Scan on mytab (cost=0.00..11063.00 rows=437500 width=4)
> -> Hash (cost=164.00..164.00 rows=10000 width=4)
> -> Seq Scan on mytab1 (cost=0.00..164.00 rows=10000 width=4)
>
>
> As above Right and Full join paths cannot be parallel as they can produce
> false null extended rows because outer table is partial path and not
> completely visible.
> Adding a patch to fix same.

Committed. But I think the regression test needs more thought, so I
left that out.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message David G. Johnston 2016-04-20 21:51:35 Typos/Questions in bloom documentation
Previous Message Tom Lane 2016-04-20 21:16:43 Re: Proposal: Remove regress-python3-mangle.mk