Re: why not parallel seq scan for slow functions

From:	Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To:	Robert Haas <robertmhaas(at)gmail(dot)com>
Cc:	Jeff Janes <jeff(dot)janes(at)gmail(dot)com>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: why not parallel seq scan for slow functions
Date:	2017-08-12 13:18:51
Message-ID:	CAA4eK1+X89Qk8k3Q9feiOyy5rvbiMjsS55e6pDLA0zYRU+ACMg@mail.gmail.com
Lists:	pgsql-hackers

On Thu, Aug 10, 2017 at 1:07 AM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> On Tue, Aug 8, 2017 at 3:50 AM, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
>> Right.
>>
>> I see two ways to include the cost of the target list for parallel
>> paths before rejecting them (a) Don't reject parallel paths
>> (Gather/GatherMerge) during add_path. This has the danger of path
>> explosion. (b) In the case of parallel paths, somehow try to identify
>> that path has a costly target list (maybe just check if the target
>> list has anything other than vars) and use it as a heuristic to decide
>> whether a parallel path can be retained.
>
> I think the right approach to this problem is to get the cost of the
> GatherPath correct when it's initially created. The proposed patch
> changes the cost after-the-fact, but that (1) doesn't prevent a
> promising path from being rejected before we reach this point and (2)
> is probably unsafe, because it might confuse code that reaches the
> modified-in-place path through some other pointer (e.g. code which
> expects the RelOptInfo's paths to still be sorted by cost). Perhaps
> the way to do that is to skip generate_gather_paths() for the toplevel
> scan/join node and do something similar later, after we know what
> target list we want.
>
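
To make point (2) above concrete, here is a minimal sketch of why
changing a path's cost in place can be unsafe when other code assumes
the path list is sorted by cost. It is a toy model, not planner code:
the Path class, toy_add_path and is_sorted_by_cost are hypothetical
names, and the real add_path does considerably more than a sorted
insert.

```python
import bisect
from dataclasses import dataclass


@dataclass
class Path:
    """Toy stand-in for a planner path; only the total cost matters here."""
    name: str
    total_cost: float


def toy_add_path(pathlist, new_path):
    """Insert new_path so that pathlist stays sorted by total_cost.

    Hypothetical helper: it only models the 'kept sorted by cost'
    property and ignores everything else a real add_path would check.
    """
    costs = [p.total_cost for p in pathlist]
    pathlist.insert(bisect.bisect_right(costs, new_path.total_cost), new_path)


def is_sorted_by_cost(pathlist):
    """Return True if every path is no more expensive than its successor."""
    return all(a.total_cost <= b.total_cost
               for a, b in zip(pathlist, pathlist[1:]))


pathlist = []
gather = Path("Gather", 100.0)
toy_add_path(pathlist, Path("SeqScan", 150.0))
toy_add_path(pathlist, gather)
toy_add_path(pathlist, Path("IndexScan", 120.0))
sorted_before = is_sorted_by_cost(pathlist)

# Adjust the cost after the fact, through a pointer held elsewhere.
gather.total_cost += 500.0
sorted_after = is_sorted_by_cost(pathlist)
```

In this sketch the list is sorted right after the inserts, and no
longer sorted once the Gather entry's cost is bumped in place; any
code that walks the list expecting cheapest-first order would now be
misled.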

I think skipping the generation of gather paths for a scan node or a
top-level join node generated via standard_join_search seems
straightforward, but skipping it for paths generated via geqo seems to
be tricky (see the use of generate_gather_paths in merge_clump).
Assuming we find some way to skip it for the top-level scan/join node,
I don't think that will be sufficient: we have a special way to push
the target list below the Gather node in apply_projection_to_path, and
we need to move that part into generate_gather_paths as well.

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

In response to

Re: why not parallel seq scan for slow functions at 2017-08-09 19:37:37 from Robert Haas

Responses

Re: why not parallel seq scan for slow functions at 2017-08-15 13:45:42 from Robert Haas
Re: why not parallel seq scan for slow functions at 2017-08-17 08:39:23 from Dilip Kumar
