Re: why not parallel seq scan for slow functions

From: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: Dilip Kumar <dilipbalaut(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: why not parallel seq scan for slow functions
Date: 2017-07-12 17:50:21
Message-ID: CAMkU=1wDRf7ayn0c+j5GR57UfKsMkikn1fhgQNwQnr+ApEXHcQ@mail.gmail.com
Lists: pgsql-hackers

On Tue, Jul 11, 2017 at 10:25 PM, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
wrote:

> On Wed, Jul 12, 2017 at 1:50 AM, Jeff Janes <jeff(dot)janes(at)gmail(dot)com> wrote:
> > On Mon, Jul 10, 2017 at 9:51 PM, Dilip Kumar <dilipbalaut(at)gmail(dot)com> wrote:
> >>
> >> So because of this high projection cost the seqpath and parallel path
> >> both have fuzzily same cost but seqpath is winning because it's
> >> parallel safe.
> >
> >
> > I think you are correct. However, unless parallel_tuple_cost is set very
> > low, apply_projection_to_path never gets called with the Gather path as an
> > argument. It gets ruled out at some earlier stage, presumably because it
> > assumes the projection step cannot make it win if it is already behind by
> > enough.
> >
>
> I think that is genuine because tuple communication cost is very high.
>

Sorry, I don't know which you think is genuine, the early pruning or my
complaint about the early pruning.

I agree that the communication cost is high, which is why I don't want to
have to set parallel_tuple_cost very low. For example, to get the benefit
of your patch, I have to set parallel_tuple_cost to 0.0049 or less (in my
real-world case, not the dummy test case I posted, although the numbers are
around the same for that one too). But with a setting that low, all kinds
of other things also start using parallel plans, even if they don't benefit
from them and are harmed.

I realize we need to do some aggressive pruning to avoid an exponential
explosion in planning time, but in this case it has some rather unfortunate
consequences. I wanted to explore it, but I can't figure out where this
particular pruning is taking place.

By the time we get to planner.c line 1787, current_rel->pathlist already
does not contain the parallel plan if parallel_tuple_cost >= 0.0050, so the
pruning is happening earlier than that.

> If your table is reasonable large then you might want to try by
> increasing parallel workers (Alter Table ... Set (parallel_workers =
> ..))
>

Setting parallel_workers to 8 changes the threshold for the parallel plan to
even be considered from parallel_tuple_cost <= 0.0049 to <= 0.0076. So it
is going in the correct direction, but not by enough to matter.
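
A minimal sketch of the kind of session described above, for anyone who wants
to try the same experiment. The table name big_table, the column x, and the
function slow_func are hypothetical placeholders, not names from this thread,
and the threshold values will differ from table to table.

```sql
-- Hypothetical names: big_table, x, and slow_func are placeholders only.

-- Lower the per-tuple transfer cost for this session to the value at which
-- the parallel plan first appeared in the case described above.
SET parallel_tuple_cost = 0.0049;

-- Ask the planner to assume more workers for this table.
ALTER TABLE big_table SET (parallel_workers = 8);

-- Check whether a Gather node now shows up in the plan.
EXPLAIN SELECT slow_func(x) FROM big_table;

-- Restore the session default afterwards.
RESET parallel_tuple_cost;
```

Comparing the EXPLAIN output at a few parallel_tuple_cost values on either
side of the threshold shows where the parallel plan stops being considered.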

Cheers,

Jeff
