Quick Links

Re: pgsql: Add parallel-aware hash joins.

From:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To:	Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>
Cc:	Andres Freund <andres(at)anarazel(dot)de>, pgsql-committers <pgsql-committers(at)postgresql(dot)org>
Subject:	Re: pgsql: Add parallel-aware hash joins.
Date:	2018-01-03 18:03:39
Message-ID:	18804.1515002619@sss.pgh.pa.us
Views:	Whole Thread \| Raw Message \| Download mbox \| Resend email
Thread:
Lists:	pgsql-committers pgsql-hackers

Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com> writes:
> On Wed, Jan 3, 2018 at 2:38 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> Hm. That could do it, except it doesn't really account for the observed
>> result that slower single-processor machines seem more prone to the
>> bug. Surely they should be less likely to get multiple workers activated.

> It plans for 8 batches, and then usually but not always goes to 16 at
> execution time depending on timing. It doesn't happen for me with
> 128kB (the setting used in the regression test), but I think the
> affected BF machines are all 32 bit systems that have MAXIMUM_ALIGNOF
> 8 and therefore use a bit more space, whereas my machines have
> MAXIMUM_ALIGNOF 4 in a 32 bit build, so that would explain the
> different location of this unstable border.

Ah, that might do it. You're right that gaur/pademelon are in that class.

> We could probably fix the
> failures by simply increasing work_mem out of that zone, but I'm
> hoping to fix the problem in a more principled way. More soon.

Meh. I think we're going to end up having to pick a modified test
case that's further away from the chunk size threshold. I do not
think it is possible to predict this runtime behavior exactly at
plan time, nor am I convinced that expending planner cycles on a
somewhat closer estimate is a useful use of planning time.

>> I'm also wondering why the non-parallel path seems to prefer to allocate
>> in units of HASH_CHUNK_SIZE + HASH_CHUNK_HEADER_SIZE while the parallel
>> path targets allocations of exactly HASH_CHUNK_SIZE,

> That is intentional: dsa.c sucks at allocating 32KB + a tiny bit
> because it works in 4KB pages for large allocations, so I wanted to
> make HASH_CHUNK_SIZE the total size that arrives into dsa_allocate().
> The non-parallel path has similar problems on some libc
> implementations, as we discussed over here:
> https://www.postgresql.org/message-id/flat/29770.1511495642%40sss.pgh.pa.us

Oh, right, I'd forgotten about that discussion. I think it would be
good to adjust hashjoin so that both paths are targeting 32KB total;
but you are right that there's not a lot of point in fooling with the
non-parallel path until we add some way of accounting for aset.c's
overhead too. We can leave that for later.

>> BTW, I'm seeing a few things that look bug-like about
>> ExecParallelHashTuplePrealloc, ...

> Right. Fixed in the attached.

Pushed, plus an additional Assert to clarify that we're expecting
ExecParallelHashTuplePrealloc's caller to have already maxalign'ed
the request size.

regards, tom lane

In response to

Re: pgsql: Add parallel-aware hash joins. at 2018-01-03 12:06:38 from Thomas Munro

Responses

Re: pgsql: Add parallel-aware hash joins. at 2018-01-04 01:17:04 from Tom Lane

Browse pgsql-committers by date

	From	Date	Subject
Next Message	Tom Lane	2018-01-03 19:16:07	Re: pgsql: Fix deadlock hazard in CREATE INDEX CONCURRENTLY
Previous Message	Tom Lane	2018-01-03 17:53:54	pgsql: Fix some minor errors in new PHJ code.

Browse pgsql-hackers by date

	From	Date	Subject
Next Message	Tom Lane	2018-01-03 18:34:02	Re: to_timestamp TZH and TZM format specifiers
Previous Message	Andrew Dunstan	2018-01-03 18:03:01	to_timestamp TZH and TZM format specifiers