Re: Parallel Append implementation

From: Amit Khandekar <amitdkhan(dot)pg(at)gmail(dot)com>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Rafia Sabih <rafia(dot)sabih(at)enterprisedb(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Ashutosh Bapat <ashutosh(dot)bapat(at)enterprisedb(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Parallel Append implementation
Date: 2017-09-14 15:00:52
Message-ID: CAJ3gD9fJVHsfF-34YDWEpzjur8P0au+B4BnB3D+kr2wfz7xFNw@mail.gmail.com
Lists: pgsql-hackers

On 11 September 2017 at 18:55, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
>>> Do you think non-parallel-aware Append
>>> will be better in any case when there is a parallel-aware append? I
>>> mean to say let's try to create non-parallel-aware append only when
>>> parallel-aware append is not possible.
>>
>> By non-parallel-aware append, I am assuming you meant partial
>> non-parallel-aware Append. Yes, if the parallel-aware Append path has
>> *all* partial subpaths chosen, then we do omit a partial non-parallel
>> Append path, as seen in this code in the patch :
>>
>> /*
>> * Consider non-parallel partial append path. But if the parallel append
>> * path is made out of all partial subpaths, don't create another partial
>> * path; we will keep only the parallel append path in that case.
>> */
>> if (partial_subpaths_valid && !pa_all_partial_subpaths)
>> {
>> ......
>> }
>>
>> But if the parallel-Append path has a mix of partial and non-partial
>> subpaths, then we can't really tell which of the two could be cheapest
>> until we calculate the cost. It can be that the non-parallel-aware
>> partial Append can be cheaper as well.
>>
>
> How? See, if you have four partial subpaths and two non-partial
> subpaths, then for parallel-aware append it considers all six paths in
> parallel path whereas for non-parallel-aware append it will consider
> just four paths and that too with sub-optimal strategy. Can you
> please try to give me some example so that it will be clear.

Suppose 4 appendrel children have the following costs for their cheapest
partial (p) and non-partial (np) paths:

p1=5000 np1=100
p2=200 np2=1000
p3=80 np3=2000
p4=3000 np4=50

Here, the following two Append paths will be generated:

1. A parallel-aware Append path with subpaths:
np1, p2, p3, np4

2. A partial (i.e. non-parallel-aware) Append path with all partial subpaths:
p1, p2, p3, p4
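
To make the selection above concrete, here is a minimal C sketch, assuming
the parallel-aware Append simply takes the cheaper of each child's two
paths. ChildCosts and choose_pa_subpaths are hypothetical names used only
for illustration, not code from the patch, and letting ties go to the
partial path is an assumption. It does not compute the cost of either
Append path.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-child cost pair; not a PostgreSQL struct. */
typedef struct ChildCosts
{
    double partial_cost;     /* cost of the cheapest partial path (p) */
    double nonpartial_cost;  /* cost of the cheapest non-partial path (np) */
} ChildCosts;

/*
 * For each child, record in use_partial[i] whether the parallel-aware
 * Append would take the partial path (true) or the non-partial one (false).
 * Returns true only if every chosen subpath is partial.
 * Assumption: on equal cost the partial path is chosen.
 */
bool
choose_pa_subpaths(const ChildCosts *children, size_t nchildren,
                   bool *use_partial)
{
    bool all_partial = true;

    assert(children != NULL && use_partial != NULL);

    for (size_t i = 0; i < nchildren; i++)
    {
        use_partial[i] =
            children[i].partial_cost <= children[i].nonpartial_cost;
        if (!use_partial[i])
            all_partial = false;
    }
    return all_partial;
}
```

Called with the four children above, it picks the non-partial path for
children 1 and 4 and the partial path for children 2 and 3, and returns
false because the chosen subpaths are mixed.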

Now, one thing we could do above is make path #2 parallel-aware as
well, so that both Append paths would be parallel-aware. Is this exactly
what you are suggesting?

So what I am saying above is that we can't tell which of paths #1 and
#2 is cheaper until we calculate the total cost. I didn't understand what
you meant by "non-parallel-aware append will consider only the
partial subpaths and that too with sub-optimal strategy" in the above
example. I guess you were considering a different scenario from the
above one.

On the other hand, if one or more children of the Append do not have a
partial path in the first place, then a non-parallel-aware partial Append
is out of the question, which we both agree on.
The other case where we skip the non-parallel-aware partial Append is
when all the cheapest subpaths of the parallel-aware Append path are
partial paths: we do not want a parallel-aware and a non-parallel-aware
Append path that both have exactly the same partial subpaths.
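
The two skip cases can be sketched as a single check. This is only an
illustration: ChildSummary and want_plain_partial_append are hypothetical
names, and only the two flag names are borrowed from the condition in the
patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical summary of one appendrel child; not a PostgreSQL struct. */
typedef struct ChildSummary
{
    bool has_partial_path;   /* does the child have any partial path? */
    bool pa_picked_partial;  /* did the parallel-aware Append pick it? */
} ChildSummary;

/*
 * Returns true if a non-parallel-aware partial Append should also be
 * considered: every child has a partial path, and the parallel-aware
 * Append did not already consist of partial subpaths only.
 */
bool
want_plain_partial_append(const ChildSummary *children, size_t nchildren)
{
    bool partial_subpaths_valid = true;
    bool pa_all_partial_subpaths = true;

    assert(children != NULL || nchildren == 0);

    for (size_t i = 0; i < nchildren; i++)
    {
        if (!children[i].has_partial_path)
            partial_subpaths_valid = false;
        if (!children[i].pa_picked_partial)
            pa_all_partial_subpaths = false;
    }
    return partial_subpaths_valid && !pa_all_partial_subpaths;
}
```

For the four-child example earlier in this mail the check returns true,
since every child has a partial path but the parallel-aware Append chose
non-partial paths for two of them.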

---------

I will be addressing your other comments separately.

Thanks
-Amit Khandekar
