Re: Parallel Append implementation

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Amit Khandekar <amitdkhan(dot)pg(at)gmail(dot)com>, Rafia Sabih <rafia(dot)sabih(at)enterprisedb(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Ashutosh Bapat <ashutosh(dot)bapat(at)enterprisedb(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Parallel Append implementation
Date: 2017-09-30 04:20:24
Message-ID: CAA4eK1LpZe34wzYstihz=mjK0kdOJHYh+nWgERX21wGK=4saZQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Sat, Sep 30, 2017 at 4:02 AM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> On Fri, Sep 29, 2017 at 7:48 AM, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
>> I think in general the non-partial paths should be cheaper as compared
>> to partial paths as that is the reason planner choose not to make a
>> partial plan at first place. I think the idea patch is using will help
>> because the leader will choose to execute partial path in most cases
>> (when there is a mix of partial and non-partial paths) and for that
>> case, the leader is not bound to complete the execution of that path.
>> However, if all the paths are non-partial, then I am not sure much
>> difference it would be to choose one path over other.
>
> The case where all plans are non-partial is the case where it matters
> most! If the leader is going to take a share of the work, we want it
> to take the smallest share possible.
>

Okay, but the point is whether it will make any difference
practically. Let us try to see with an example, consider there are
two children (just taking two for simplicity, we can extend it to
many) and first having 1000 pages to scan and second having 900 pages
to scan, then it might not make much difference which child plan
leader chooses. Now, it might matter if the first child relation has
1000 pages to scan and second has just 1 page to scan, but not sure
how much difference will it be in practice considering that is almost
the maximum possible theoretical difference between two non-partial
paths (if we have pages greater than 1024 pages
(min_parallel_table_scan_size) then it will have a partial path).

> It's a lot fuzzier what is best when there are only partial plans.
>

The point that bothers me a bit is whether it is a clear win if we
allow the leader to choose a different strategy to pick the paths or
is this just our theoretical assumption. Basically, I think the patch
will become simpler if pick some simple strategy to choose paths.

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Pavel Stehule 2017-09-30 04:28:52 Re: SQL/JSON in PostgreSQL
Previous Message Robert Haas 2017-09-29 23:42:02 64-bit queryId?