Re: Avoiding hash join batch explosions with extreme skew and weird stats

From: Melanie Plageman <melanieplageman(at)gmail(dot)com>
To: Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
Cc: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Jesse Zhang <sbjesse(at)gmail(dot)com>, dkimura(at)pivotal(dot)io
Subject: Re: Avoiding hash join batch explosions with extreme skew and weird stats
Date: 2020-04-30 14:30:35
Message-ID: CAAKRu_am1yFRKWKq5_zPQHDnfaDHQPM8xmqVmvta2HSHqjpD3w@mail.gmail.com
Lists: pgsql-hackers

On Tue, Apr 28, 2020 at 11:50 PM Heikki Linnakangas <hlinnaka(at)iki(dot)fi> wrote:

> On 29/04/2020 05:03, Melanie Plageman wrote:
> > I've attached a patch which should address some of the previous feedback
> > about code complexity. Two of my co-workers and I wrote what is
> > essentially a new prototype of the idea. It uses the main state machine
> > to route emitting unmatched tuples instead of introducing a separate
> > state. The logic for falling back is also more developed.
>
> I haven't looked at the patch in detail, but thanks for the commit
> message; it describes very well what this is all about. It would be nice
> to copy that explanation to the top comment in nodeHashJoin.c in some
> form. I think we're missing a high level explanation of how the batching
> works even before this new patch, and that commit message does a good
> job at it.
>
>
Thanks for taking a look, Heikki!

I made a few edits to the message and threw it into a draft patch (on
top of master, of course). I didn't want to junk up people's inboxes, so
I didn't start a separate thread, but it will be pretty hard to
collaboratively edit the comment, or ever register it for a commitfest,
if it stays wedged into this thread. What do you think?

--
Melanie Plageman

Attachment: v1-0001-Describe-hybrid-hash-join-implementation.patch (text/x-patch, 2.8 KB)
