Re: why not parallel seq scan for slow functions

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: why not parallel seq scan for slow functions
Date: 2017-09-05 08:34:31
Message-ID: CAA4eK1Ji_0pgrjFotAyvvfxGikxJQEKcxD863VQ-xYtfQBy0uQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Fri, Aug 25, 2017 at 10:08 PM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> On Mon, Aug 21, 2017 at 5:08 AM, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
>> (b) I have changed the costing of gather path for path target in
>> generate_gather_paths which I am not sure is the best way. Another
>> possibility could have been that I change the code in
>> apply_projection_to_path as done in the previous patch and just call
>> it from generate_gather_paths. I have not done that because of your
>> comment above thread ("is probably unsafe, because it might confuse
>> code that reaches the modified-in-place path through some other
>> pointer (e.g. code which expects the RelOptInfo's paths to still be
>> sorted by cost)."). It is not clear to me what exactly is bothering
>> you if we directly change costing in apply_projection_to_path.
>
> The point I was trying to make is that if you retroactively change the
> cost of a path after you've already done add_path(), it's too late to
> change your mind about whether to keep the path. At least according
> to my current understanding, that's the root of this problem in the
> first place. On top of that, add_path() and other routines that deal
> with RelOptInfo path lists expect surviving paths to be ordered by
> descending cost; if you frob the cost, they might not be properly
> ordered any more.
>

Okay, now I understand your point, but I think we already change the
cost of paths in apply_projection_to_path which is done after add_path
for top level scan/join paths. I think this matters a lot in case of
Gather because the cost of computing target list can be divided among
workers. I have changed the patch such that parallel paths for top
level scan/join node will be generated after pathtarget is ready. I
had kept the costing of path targets local to
apply_projection_to_path() as that makes the patch simple.

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

Attachment Content-Type Size
parallel_paths_include_tlist_cost_v2.patch application/octet-stream 10.8 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Daniel Gustafsson 2017-09-05 08:34:43 Re: Refactoring identifier checks to consistently use strcmp
Previous Message Sandro Santilli 2017-09-05 08:22:23 Re: pg_upgrade changes can it use CREATE EXTENSION?