Re: Avoiding hash join batch explosions with extreme skew and weird stats

From: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
To: Melanie Plageman <melanieplageman(at)gmail(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Avoiding hash join batch explosions with extreme skew and weird stats
Date: 2019-07-13 23:44:52
Message-ID: 20190713234452.dsh4vchhb5eobybh@development
Lists: pgsql-hackers

On Wed, Jul 03, 2019 at 02:22:09PM -0700, Melanie Plageman wrote:
>On Tue, Jun 18, 2019 at 3:24 PM Melanie Plageman <melanieplageman(at)gmail(dot)com>
>wrote:
>
>>
>> These questions will probably make a lot more sense with corresponding
>> code, so I will follow up with the second version of the state machine
>> patch once I finish it.
>>
>>
>I have changed the state machine and resolved the questions I had
>raised in the previous email. This seems to work for the parallel and
>non-parallel cases. I have not yet rewritten the unmatched outer tuple
>status as a bitmap in a spill file (for ease of debugging).
>
>Before doing that, I wanted to ask what a desirable fallback condition
>would be. In this patch, fallback to hashloop join happens only while
>loading tuples into the hashtable for batches after batch 0, at the
>point where inserting another tuple from the batch file would exceed
>work_mem. This means nbatches can never be increased, which, I would
>think, is undesirable.
>

Yes, I think that's undesirable.
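To make the problem concrete, the trigger in the current patch amounts to something like the following (a minimal sketch with hypothetical names, not the actual patch code): the batch falls back as soon as one more tuple would exceed work_mem, without ever attempting to double nbatch first.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-batch state, not the patch's actual struct. */
typedef struct BatchState
{
    size_t  space_used;     /* bytes already in the hash table */
    size_t  work_mem;       /* memory budget in bytes */
    bool    fallback;       /* use hashloop join for this batch? */
} BatchState;

/*
 * Try to add one tuple from the batch file to the hash table.  In the
 * current patch, hitting the limit immediately marks the batch for
 * hashloop join -- nbatch is never increased at this point.
 */
static bool
insert_tuple(BatchState *batch, size_t tuple_size)
{
    if (batch->space_used + tuple_size > batch->work_mem)
    {
        batch->fallback = true;
        return false;
    }
    batch->space_used += tuple_size;
    return true;
}
```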

>I thought a bit about when fallback should happen. So, let's say that
>we would like to fall back to hashloop join once we have increased
>nbatches X times. At that point, since we do not want to fall back to
>hashloop join for all batches, we have to make a decision. After
>increasing nbatches the Xth time, do we then fall back for all batches
>for which inserting inner tuples exceeds work_mem? Or do we use this
>strategy but with work_mem plus some fudge factor?
>
>Or, do we instead try to determine whether data skew led us to
>increase nbatches each time, work out which batch (given the new
>nbatches) contains the skewed data, set fallback to true only for that
>batch, and let all other batches use the existing logic (with no
>fallback option) unless they too contain a value that leads to
>increasing nbatches X times?
>

I think we should try to detect the skew and use this hashloop logic
only for the one batch. That's based on the assumption that the hashloop
is less efficient than the regular hashjoin.

We may need to apply it even to some non-skewed (but misestimated)
cases, though. At some point the BufFiles alone would need more than
work_mem, at which point we ought to switch to the hashloop.
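That limit is easy to state: each batch keeps an in-memory buffer for its spill file (8kB per BufFile in PostgreSQL), so doubling nbatch doubles that overhead. A sketch of the check (function name hypothetical):

```c
#include <stdbool.h>
#include <stddef.h>

#define BUFFILE_BUFFER_SIZE 8192    /* per-file buffer, 8kB in PostgreSQL */

/*
 * Once the spill-file buffers alone would not fit in work_mem,
 * increasing nbatch further is counterproductive and the batch should
 * switch to the hashloop strategy instead.
 */
static bool
buffile_overhead_exceeded(unsigned nbatch, size_t work_mem)
{
    /* one spill file per batch for each side of the join */
    return (size_t) nbatch * 2 * BUFFILE_BUFFER_SIZE > work_mem;
}
```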

regards

--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
