|From:||Dilip Kumar <dilipbalaut(at)gmail(dot)com>|
|To:||Robert Haas <robertmhaas(at)gmail(dot)com>|
|Cc:||Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>|
|Subject:||Re: Proposal : Parallel Merge Join|
|Views:||Raw Message | Whole Thread | Download mbox|
On Fri, Mar 3, 2017 at 3:57 PM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> I'm not happy with the way this patch can just happen to latch on to a
> path that's not parallel-safe rather than one that is and then just
> give up on a merge join in that case. I already made this argument in
> and my opinion hasn't changed. I've subsequently come to realize that
> this problem is more widespread that I initially realized: not only do
> sort_inner_and_outer() and generate_partial_mergejoin_paths() give up
> without searching for the cheapest parallel-safe path, but the latter
> later calls get_cheapest_path_for_pathkeys() and then just pretends it
> didn't find anything unless the result is parallel-safe. This makes
> no sense to me: the cheapest parallel-safe path could be 1% more
> expensive than the cheapest path of any kind, but because it's not the
> cheapest we arbitrarily skip it, even though parallelizing more of the
> tree might leave us way ahead overall.
for sort_inner_and_outer I have followed the same logic what
hash_inner_and_outer is doing. Also moved the logic of selecting the
cheapest_safe_inner out of the pathkey loop.
> I suggest that we add an additional "bool require_parallel_safe"
> argument to get_cheapest_path_for_pathkeys(); when false, the function
> will behave as at present, but when true, it will skip paths that are
> not parallel-safe. And similarly have a function to find the cheapest
> parallel-safe path if the first one in the list isn't it. See
> attached draft patch. Then this code can search for the correct path
> instead of searching using the wrong criteria and then giving up if it
> doesn't get the result it wants.
Done as suggested. Also, rebased the path_search_for_mergejoin on
head and updated the comments of get_cheapest_path_for_pathkeys for
> + if (!(joinrel->consider_parallel &&
> + save_jointype != JOIN_UNIQUE_OUTER &&
> + save_jointype != JOIN_FULL &&
> + save_jointype != JOIN_RIGHT &&
> + outerrel->partial_pathlist != NIL &&
> + bms_is_empty(joinrel->lateral_relids)))
> I would distribute the negation, so that this reads if
> (!joinrel->consider_parallel || save_jointype == JOIN_UNIQUE_OUTER ||
> save_jointype == JOIN_FULL || ...). Or maybe better yet, just drop
> the ! and the return that follows and put the
> consider_parallel_nestloop and consider_parallel_mergejoin calls
> inside the if-block.
> You could Assert(nestjoinOK) instead of testing it, although I guess
> it doesn't really matter.
left as it is.
|Next Message||Robert Haas||2017-03-06 13:17:59||Re: PATCH: Configurable file mode mask|
|Previous Message||Giuseppe Broccolo||2017-03-06 12:49:24||dump a comment of a TSDictionary|