Re: pgsql: Add parallel-aware hash joins.

From: Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: pgsql-committers <pgsql-committers(at)postgresql(dot)org>
Subject: Re: pgsql: Add parallel-aware hash joins.
Date: 2017-12-21 09:54:44
Message-ID: CAEepm=1HL7iLoAEOsO5vf_fE7p_AgXW7B6ii9C0Vs_yGL2bOPQ@mail.gmail.com
Lists: pgsql-committers pgsql-hackers

On Thu, Dec 21, 2017 at 10:29 PM, Andres Freund <andres(at)anarazel(dot)de> wrote:
> On 2017-12-21 08:49:46 +0000, Andres Freund wrote:
>> Add parallel-aware hash joins.
>
> There's two relatively mundane failures:
> https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=tern&dt=2017-12-21%2008%3A48%3A12
> https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=termite&dt=2017-12-21%2008%3A50%3A08

Right, it looks like something takes more space on ppc systems, causing
a batch increase that doesn't happen on amd64. I'll come back to
that.

> but also one that's a lot more interesting:
> https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=capybara&dt=2017-12-21%2008%3A50%3A08
>
> which shows an assert failure:
>
> #2 0x00000000008687d1 in ExceptionalCondition (conditionName=conditionName(at)entry=0xa76a98 "!(!accessor->sts->participants[i].writing)", errorType=errorType(at)entry=0x8b2c49 "FailedAssertion", fileName=fileName(at)entry=0xa76991 "sharedtuplestore.c", lineNumber=lineNumber(at)entry=273) at assert.c:54
> #3 0x000000000089883e in sts_begin_parallel_scan (accessor=0xfaf780) at sharedtuplestore.c:273
> #4 0x0000000000634de4 in ExecParallelHashRepartitionRest (hashtable=0xfaec18) at nodeHash.c:1369
> #5 ExecParallelHashIncreaseNumBatches (hashtable=0xfaec18) at nodeHash.c:1198
> #6 0x000000000063546b in ExecParallelHashTupleAlloc (hashtable=hashtable(at)entry=0xfaec18, size=40, shared=shared(at)entry=0x7ffee26a8868) at nodeHash.c:2778
> #7 0x00000000006357c8 in ExecParallelHashTableInsert (hashtable=hashtable(at)entry=0xfaec18, slot=slot(at)entry=0xfa76f8, hashvalue=<optimized out>) at nodeHash.c:1696
> #8 0x0000000000635b5f in MultiExecParallelHash (node=0xf7ebc8) at nodeHash.c:288
> #9 MultiExecHash (node=node(at)entry=0xf7ebc8) at nodeHash.c:112
>
> which seems to suggest that something in the state machine logic is
> borked. ExecParallelHashIncreaseNumBatches() should've ensured that
> everyone has called sts_end_write()...

Hmm. This looks the same as the one-off assertion failure that
I mentioned[1] and had not been able to reproduce. Investigating.

[1] https://www.postgresql.org/message-id/CAEepm%3D0oE%3DyO0Kam86W1d-iJoasWByYkcrkDoJu6t5mRhFGHkQ%40mail.gmail.com

--
Thomas Munro
http://www.enterprisedb.com
