Re: Parallel Append implementation

From: Amit Khandekar <amitdkhan(dot)pg(at)gmail(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Rafia Sabih <rafia(dot)sabih(at)enterprisedb(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Ashutosh Bapat <ashutosh(dot)bapat(at)enterprisedb(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Parallel Append implementation
Date: 2017-08-30 12:02:52
Message-ID: CAJ3gD9dv7oCsmusWP63VFp9pfkHZ6fYVPqSXPYz9cSEp90n5Yg@mail.gmail.com
Lists: pgsql-hackers

Hi Rafia,

On 17 August 2017 at 14:12, Amit Khandekar <amitdkhan(dot)pg(at)gmail(dot)com> wrote:
> But for all of the cases here, partial
> subplans seem possible, and so even on HEAD it executed Partial
> Append. So between a Parallel Append having partial subplans and a
> Partial Append having partial subplans , the cost difference would not
> be significant. Even if we assume that Parallel Append was chosen
> because its cost turned out to be a bit cheaper, the actual
> performance gain seems quite large as compared to the expected cost
> difference. So it might be even possible that the performance gain
> might be due to some other reasons. I will investigate this, and the
> other queries.
>

I ran all the queries that showed performance benefits in your run.
For me, however, Parallel Append (PA) shows a benefit only for plans
that also use Partition-Wise Join (PWJ).

For all the queries that use only PA plans and no PWJ plans, I got
exactly the same plan on HEAD as with the PA+PWJ patch, except that
with the latter the Append is a ParallelAppend. For you, on the other
hand, the join order of the plans has changed.

Regarding actual costs: consequently, for me the actual costs are more
or less the same for HEAD and PA+PWJ. In your runs, the costs naturally
differ quite a bit, because the plans themselves are different on HEAD
versus PA+PWJ.

My PA+PWJ plan outputs (and actual costs) match exactly what you get
with PA+PWJ patch. But like I said, I get the same join order and same
plans (and actual costs) for HEAD as well (except
ParallelAppend=>Append).
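
The "same plan except ParallelAppend=>Append" check can be scripted
rather than done by eye. Below is a minimal sketch, assuming the plans
are saved in the default text format of EXPLAIN and that the patched
node is labelled "Parallel Append" there. The function names
plan_shape and same_plan_modulo_append are hypothetical helpers for
illustration only, not part of the patch or of PostgreSQL.

```python
def plan_shape(explain_text):
    """Return (indent, node label) pairs for each plan node in EXPLAIN text.

    Assumption: every plan node line contains "(cost=", while detail
    lines such as "Workers Planned: 2" do not.
    """
    shape = []
    for line in explain_text.splitlines():
        if "(cost=" not in line:
            continue  # not a plan node line
        head = line.split("(cost=", 1)[0]
        indent = len(head) - len(head.lstrip())
        label = head.strip()
        if label.startswith("->"):
            label = label[2:].strip()
        # Assumption: treat the patched node as equivalent to a plain Append.
        if label == "Parallel Append":
            label = "Append"
        shape.append((indent, label))
    return shape


def same_plan_modulo_append(plan_a, plan_b):
    """True if both plans have the same node tree, ignoring costs and
    the Append versus Parallel Append distinction."""
    return plan_shape(plan_a) == plan_shape(plan_b)
```

Because costs and timings are dropped, a changed join order shows up
as a mismatch, while a pure Append to Parallel Append switch does not.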

Maybe, if you have the latest HEAD code on your setup, you could
re-check some of the queries yourself to see whether they still show
higher costs than with PA? I suspect that some changes in the latest
code are causing this discrepancy, because when I tested some of the
explains with a HEAD-branch server running with your database, I got
results matching the PA figures.

Attached are my EXPLAIN ANALYZE outputs.

On 16 August 2017 at 18:34, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> Thanks for the benchmarking results!
>
> On Tue, Aug 15, 2017 at 11:35 PM, Rafia Sabih
> <rafia(dot)sabih(at)enterprisedb(dot)com> wrote:
>> Q4 | 244 | 12 | PA and PWJ, time by only PWJ - 41
>
> 12 seconds instead of 244? Whoa. I find it curious that we picked a
> Parallel Append with a bunch of non-partial plans when we could've
> just as easily picked partial plans, or so it seems to me. To put
> that another way, why did we end up with a bunch of Bitmap Heap Scans
> here instead of Parallel Bitmap Heap Scans?

Actually, with 2 workers the cost difference between a Parallel Append
with partial plans and a Parallel Append with non-partial plans would
be quite low. But yes, I should take a look at why it is consistently
choosing a non-partial Bitmap Heap Scan.

----

> Q6 | 29 | 12 | PA only

This one needs to be analysed, because here the plan cost is the same,
but the actual cost for PA is almost half that for HEAD. I see the
same in my run as well.

Thanks
-Amit

Attachment Content-Type Size
PA-test-AmitKh.tar.gz application/x-gzip 61.6 KB
