Re: [PATCH] Resolve Parallel Hash Join Performance Issue

From: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
To: "Deng, Gang" <gang(dot)deng(at)intel(dot)com>
Cc: "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [PATCH] Resolve Parallel Hash Join Performance Issue
Date: 2020-01-21 05:20:11
Message-ID: CA+hUKGKtumr13pf0OSOeUca9YZqFf-1xg-5wgt+YBKZGrz=xiQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

(Replies to both Gang and Tom below).

On Fri, Jan 10, 2020 at 1:52 PM Deng, Gang <gang(dot)deng(at)intel(dot)com> wrote:
> Thank you for the comment. Yes, I agree the alternative of using '(!parallel)', so that no need to test the bit. Will someone submit patch to for it accordingly?

Here's a patch like that.

On Fri, Jan 10, 2020 at 3:43 AM Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Thomas Munro <thomas(dot)munro(at)gmail(dot)com> writes:
> > Right, I see. The funny thing is that the match bit is not even used
> > in this query (it's used for right and full hash join, and those
> > aren't supported for parallel joins yet). Hmm. So, instead of the
> > test you proposed, an alternative would be to use if (!parallel).
> > That's a value that will be constant-folded, so that there will be no
> > branch in the generated code (see the pg_attribute_always_inline
> > trick). If, in a future release, we need the match bit for parallel
> > hash join because we add parallel right/full hash join support, we
> > could do it the way you showed, but only if it's one of those join
> > types, using another constant parameter.
>
> Can we base the test off the match type today, and avoid leaving
> something that will need to be fixed later?

I agree that it'd be nicer to use the logically correct thing, namely
a test of HJ_FILL_INNER(node), but that'd be a run-time check. I'd
like to back-patch this and figured that we don't want to add new
branches too casually.

I have an experimental patch where "fill_inner" and "fill_outer" are
compile-time constants and you can skip various bits of code without
branching (part of a larger experiment to figure out which of many
parameters are worth specialising at a cost of a couple of KB of text
per combination, including the ability to use wider hashes so that
monster sized joins work better). Then I could test the logically
correct thing explicitly without branches.

Attachment Content-Type Size
0001-Avoid-unnecessary-shmem-writes-in-Parallel-Hash-Join.patch application/octet-stream 2.0 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Amit Kapila 2020-01-21 05:21:35 Re: Error message inconsistency
Previous Message Amit Kapila 2020-01-21 05:19:46 Re: Error message inconsistency