Re: parallel joins, and better parallel explain

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Dilip Kumar <dilipbalaut(at)gmail(dot)com>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: parallel joins, and better parallel explain
Date: 2015-12-17 05:33:26
Message-ID: CAA4eK1LDnmGbjrkijCCR-w1x20MNGwZ2WyQviK-XSRDuPonceQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, Dec 16, 2015 at 9:55 PM, Dilip Kumar <dilipbalaut(at)gmail(dot)com> wrote:

> On Wed, Dec 16, 2015 at 6:20 PM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
> wrote:
>
> >On Tue, Dec 15, 2015 at 7:31 PM, Robert Haas <robertmhaas(at)gmail(dot)com>
> wrote:
> >>
> >> On Mon, Dec 14, 2015 at 8:38 AM, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
> wrote:
>
> > In any case,
> >I have done some more investigation of the patch and found that even
> >without changing query planner related parameters, it seems to give
> >bad plans (as in example below [1]). I think here the costing of rework
> each
>
> I have done some more testing using TPC-H benchmark (For some of the
> queries, specially for Parallel Hash Join), and Results summary is as below.
>
>
> *Planning Time(ms)*
> *Query* *Base* *Patch* TPC-H Q2 2.2 2.4 TPCH- Q3 0.67 0.71 TPCH- Q5 3.17
> 2.3 TPCH- Q7 2.43 2.4
>
>
>
> *Execution Time(ms)*
> *Query* *Base* *Patch* TPC-H Q2 2826 766 TPCH- Q3 23473 24271 TPCH- Q5
> 21357 1432 TPCH- Q7 6779 1138
> All Test files and Detail plan output is attached in mail
> q2.sql, q3.sql, q.5.sql ans q7.sql are TPCH benchmark' 2nd, 3rd, 5th and
> 7th query
> and Results with base and Parallel join are attached in q*_base.out and
> q*_parallel.out respectively.
>
> Summary: With TPC-H queries where ever Hash Join is pushed under gather
> Node, significant improvement is visible,
> with Q2, using 3 workers, time consumed is almost 1/3 of the base.
>
>
> I Observed one problem, with Q5 and Q7, there some relation and snapshot
> references are leaked and i am getting below warning, havn't yet looked
> into the issue.
>
>
While looking at plans of Q5 and Q7, I have observed that Gather is
pushed below another Gather node for which we don't have appropriate
way of dealing. I think that could be the reason why you are seeing
the errors.

Also, I think it would be good if you can once check the plan/execution
time with max_parallel_degree=0 as that can give us base reference
data without parallelism, also I am wondering if have you have changed
any other parallel cost related parameter?

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Kyotaro HORIGUCHI 2015-12-17 06:06:33 Re: Making tab-complete.c easier to maintain
Previous Message Michael Paquier 2015-12-17 04:59:20 Re: extend pgbench expressions with functions