Re: Parallel seq. plan is not coming against inheritance or partition table

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Ashutosh Sharma <ashu(dot)coek88(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, tushar <tushar(dot)ahuja(at)enterprisedb(dot)com>
Subject: Re: Parallel seq. plan is not coming against inheritance or partition table
Date: 2017-03-07 03:28:16
Message-ID: CAA4eK1+WqVK-i-+QXQPdkFquKmLcMuFQipv=fdwDHkGncKY7zA@mail.gmail.com
Lists: pgsql-hackers

On Tue, Mar 7, 2017 at 3:24 AM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> On Sun, Mar 5, 2017 at 9:41 PM, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
>>> RCA:
>>> ====
>>> From "Replace min_parallel_relation_size with two new GUCs" commit
>>> onwards, we are not assigning parallel workers for the child rel with
>>> zero heap pages. This means we won't be able to create a partial
>>> append path as this requires all the child rels within an Append Node
>>> to have a partial path.
>>
>> Right, but OTOH, if we assign parallel workers by default, then it is
>> quite possible that it would result in much worse plans. Consider a
>> case where partition hierarchy has 1000 partitions and only one of
>> them is big enough to allow parallel workers. Now in this case, with
>> your proposed fix it will try to scan all the partitions in parallel
>> workers which I think can easily result in bad performance. I think
>> the right way to make such plans parallel is by using Parallel Append
>> node (https://commitfest.postgresql.org/13/987/). Alternatively, if
>> you want to force parallelism in cases like the one you have shown in
>> example, you can use Alter Table .. Set (parallel_workers = 1).
>
> Well, I think this is a bug in
> 51ee6f3160d2e1515ed6197594bda67eb99dc2cc. The commit message doesn't
> say anything about changing any behavior, just about renaming GUCs,
> and I don't remember any discussion about changing the behavior
> either, and the comment still implies that we have the old behavior
> when we really don't, and parallel append isn't committed yet.
>

I also think that commit didn't intend to change the behavior;
however, the question is how sensible it is to keep the old behavior
once Parallel Append is in. I am not sure whether this is the right
time to consider that, or whether we should wait until Parallel Append
is committed.

> I think the problem here is that compute_parallel_worker() thinks it
> can use 0 as a sentinel value that means "ignore this", but it can't
> really, because a heap can actually contain 0 pages. Here's a patch
> which does the following:
>

-	if (heap_pages < (BlockNumber) min_parallel_table_scan_size &&
-		index_pages < (BlockNumber) min_parallel_index_scan_size &&
-		rel->reloptkind == RELOPT_BASEREL)
+	if (rel->reloptkind == RELOPT_BASEREL &&
+		((heap_pages >= 0 && heap_pages < min_parallel_table_scan_size) ||
+		 (index_pages >= 0 && index_pages < min_parallel_index_scan_size)))
 		return 0;

The purpose of considering both heap and index pages for
min_parallel_* is that if either of them is bigger than its threshold,
we should try parallelism. This could be helpful for parallel index
scans, where we compute parallel workers based on both heap and index
pages. Is there a reason for you to change that condition? It doesn't
seem related to the problem being discussed.
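
To make the point concrete, below is a minimal standalone sketch (not
the actual server code; the threshold constants are made-up stand-ins
for the min_parallel_* GUCs) comparing both forms of the check for a
relation with a tiny heap but a large index:

/*
 * Standalone sketch only: the thresholds are illustrative, not the
 * real defaults of min_parallel_table_scan_size and
 * min_parallel_index_scan_size.
 */
#include <stdbool.h>
#include <stdio.h>

#define MIN_TABLE_SCAN_PAGES 1024   /* illustrative threshold */
#define MIN_INDEX_SCAN_PAGES 64     /* illustrative threshold */

/* Existing check: suppress workers only when BOTH sizes are small. */
static bool
old_suppresses_workers(long heap_pages, long index_pages)
{
    return heap_pages < MIN_TABLE_SCAN_PAGES &&
           index_pages < MIN_INDEX_SCAN_PAGES;
}

/* Proposed check: suppress workers when EITHER supplied size is small. */
static bool
new_suppresses_workers(long heap_pages, long index_pages)
{
    return (heap_pages >= 0 && heap_pages < MIN_TABLE_SCAN_PAGES) ||
           (index_pages >= 0 && index_pages < MIN_INDEX_SCAN_PAGES);
}

int
main(void)
{
    long heap_pages = 2;        /* tiny heap */
    long index_pages = 4096;    /* index big enough for a parallel scan */

    printf("existing check suppresses workers: %d\n",
           old_suppresses_workers(heap_pages, index_pages));  /* prints 0 */
    printf("proposed check suppresses workers: %d\n",
           new_suppresses_workers(heap_pages, index_pages));  /* prints 1 */
    return 0;
}

With these inputs the existing form does not suppress workers, because
the index alone exceeds its threshold, whereas the proposed form does,
since the small heap is enough to satisfy the OR.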

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
