Re: Parallel Full Hash Join

From: Melanie Plageman <melanieplageman(at)gmail(dot)com>
To: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
Cc: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Parallel Full Hash Join
Date: 2021-04-02 17:29:38
Message-ID: CAAKRu_bt-GCf0FmqmfB+fmDCmf7Kw=RQ38Fk1-kYNofi=jAHaw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Fri, Mar 5, 2021 at 8:31 PM Thomas Munro <thomas(dot)munro(at)gmail(dot)com> wrote:
>
> On Tue, Mar 2, 2021 at 11:27 PM Thomas Munro <thomas(dot)munro(at)gmail(dot)com> wrote:
> > On Fri, Feb 12, 2021 at 11:02 AM Melanie Plageman
> > <melanieplageman(at)gmail(dot)com> wrote:
> > > I just attached the diff.
> >
> > Squashed into one patch for the cfbot to chew on, with a few minor
> > adjustments to a few comments.
>
> I did some more minor tidying of comments and naming. It's been on my
> to-do-list to update some phase names after commit 3048898e, and while
> doing that I couldn't resist the opportunity to change DONE to FREE,
> which somehow hurts my brain less, and makes much more obvious sense
> after the bugfix in CF #3031 that splits DONE into two separate
> phases. It also pairs obviously with ALLOCATE. I include a copy of
> that bugix here too as 0001, because I'll likely commit that first, so
> I rebased the stack of patches that way. 0002 includes the renaming I
> propose (master only). Then 0003 is Melanie's patch, using the name
> SCAN for the new match bit scan phase. I've attached an updated
> version of my "phase diagram" finger painting, to show how it looks
> with these three patches. "scan*" is new.

Feedback on
v6-0002-Improve-the-naming-of-Parallel-Hash-Join-phases.patch

I like renaming DONE to FREE and ALLOCATE TO REALLOCATE in the grow
barriers. FREE only makes sense for the Build barrier if you keep the
added PHJ_BUILD_RUN phase, though, I assume you would change this patch
if you decided not to add the new build barrier phase.

I like the addition of the asterisks to indicate a phase is executed by
a single arbitrary process. I was thinking, shall we add one of these to
HJ_FILL_INNER since it is only done by one process in parallel right and
full hash join? Maybe that's confusing because serial hash join uses
that state machine too, though. Maybe **? Maybe we should invent a
complicated symbolic language :)

One tiny, random, unimportant thing: The function prototype for
ExecParallelHashJoinPartitionOuter() calls its parameter "node" and, in
the definition, it is called "hjstate". This feels like a good patch to
throw in that tiny random change to make the name the same.

static void ExecParallelHashJoinPartitionOuter(HashJoinState *node);

static void
ExecParallelHashJoinPartitionOuter(HashJoinState *hjstate)

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Mike Palmiotto 2021-04-02 17:31:10 Re: Process initialization labyrinth
Previous Message Andres Freund 2021-04-02 17:27:23 Re: shared-memory based stats collector