Re: BUG #16104: Invalid DSA Memory Alloc Request in Parallel Hash

From: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
To: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
Cc: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, James Coleman <jtc331(at)gmail(dot)com>, PostgreSQL mailing lists <pgsql-bugs(at)lists(dot)postgresql(dot)org>
Subject: Re: BUG #16104: Invalid DSA Memory Alloc Request in Parallel Hash
Date: 2019-12-21 03:02:28
Message-ID: CA+hUKGKA+tKg1v29=mfxbTkXSoRdFTzmD3R0U2YgwxRxLbLtMA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs

On Sat, Dec 14, 2019 at 1:22 PM Thomas Munro <thomas(dot)munro(at)gmail(dot)com> wrote:
> > >On 2019-Dec-10, Tomas Vondra wrote:
> > >> It's probably still a good idea to do this, although I wonder if we
> > >> might start doing this only when actually running out of bits (and use
> > >> the current algorithm by default). But I have no idea if that would be
> > >> any cheaper, and it would add complexity.

> - *batchno = (hashvalue >> hashtable->log2_nbuckets) &
> (nbatch - 1);
> + *batchno = rotate(hashvalue, hashtable->log2_nbuckets)
> & (nbatch - 1);

I ran the 150MB 4096 batch "sevenb" self-join with the "rotate" patch,
and it worked as expected. I'm now planning to commit that version,
unless there are objections or someone wants to argue for a different
way to spell rotate() etc.

If I disassemble and diff unpatched and patched
ExecHashGetBucketAndBatch() and functions that inline it, as compiled
by Clang 6, I see just eg:

- 0x00000000006ba54e <+30>: shr %cl,%esi
+ 0x00000000006ba54e <+30>: ror %cl,%esi

Here's a festive pictorial representation of the various dead horses
flogged in this thread, for the holidays (view in monospace font):

The "shift" algorithm in (pre-existing coding):

<----------batchno---|
|---bucketno---|
[MSB . . . . . . . . . . . . LSB]

* can't increase nbuckets (or max bucket load would degrade too soon)
* reads zeroes off the end, so further partitioning is broken

The "bit reverse" technique proposed earlier:

|---batchno----------->
<---bucketno---|
[MSB . . . . . . . . . . . . LSB]

* could increase nbuckets: bucket load degradation is maximally deferred
* takes a few more instructions to compute batchno

The "rotate" technique:

------batchno---| <---
|---bucketno---|
[MSB . . . . . . . . . . . . LSB]

* same as "shift" until you run out of bits, then bucket load degrades
* can't increase nbuckets
* only changes one machine code instruction

The "swap ends" technique:

<----------batchno---|
|---bucketno---|
[MSB . . . . . . . . . . . . LSB]

* really just a rotated version of "rotate" technique
* meh

In response to

Responses

Browse pgsql-bugs by date

  From Date Subject
Next Message Tom Lane 2019-12-21 05:10:31 Re: BUG #16104: Invalid DSA Memory Alloc Request in Parallel Hash
Previous Message Tom Lane 2019-12-21 02:41:52 Re: BUG #16161: pg_ctl stop fails sometimes (on Windows)