Re: pgsql: Add parallel-aware hash joins.

From: Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Andres Freund <andres(at)anarazel(dot)de>, pgsql-committers <pgsql-committers(at)postgresql(dot)org>
Subject: Re: pgsql: Add parallel-aware hash joins.
Date: 2017-12-28 04:15:00
Message-ID: CAEepm=1OyZcCAxAEWYhmeg0qmWyYF2O2yW7s7A4uVTt-dgRRyQ@mail.gmail.com
Lists: pgsql-committers pgsql-hackers

On Thu, Dec 28, 2017 at 3:32 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com> writes:
>> I'll address the instability of the regression test output separately.
>
> If you're still looking for data on that --- prairiedog is able to
> reproduce the "multibatch = f" variant about one time in thirty.
> I modified the test case to print out the full EXPLAIN ANALYZE output
> rather than a heavily filtered version. Here's a typical successful run:
>
> ! Buckets: 1024 (originally 2048) Batches: 8 (originally 1) Memory Usage: 321kB
> ! Execution time: 424.395 ms
>
> and here's a failing run:
>
> ! Buckets: 1024 (originally 2048) Batches: 1 (originally 1) Memory Usage: 0kB
> ! Execution time: 243.120 ms
>
> I don't have enough insight to be totally sure what this means, but the
> "Memory Usage: 0kB" bit is obviously bogus, so I'd venture that at least
> part of the issue is failure to return stats from a worker.

Hmm. Yeah, that seems quite likely -- thanks. Investigating now.

> I also find
> it most curious that the "success" case is a lot slower than the "not
> success" case. Perhaps this is related to your livelock issue? Doing
> another run, I get something even slower:
>
> ! Buckets: 1024 (originally 2048) Batches: 8 (originally 1) Memory Usage: 321kB
> ! Execution time: 509.607 ms

Yes, that looks a lot like the ConditionVariableBroadcast() livelock
problem. That query with the same 8-batch 2-worker plan runs in ~12ms
and ~19ms for me on two different machines here without the run-to-run
variation you see.

> Aside from the instability problems, I'm pretty unhappy about how much
> the PHJ patch has added to the runtime of "make check". I do not think
> any one feature can justify adding 20% to that. Can't you cut down the
> amount of data processed by these new test cases?

Isn't that mostly because of the CV livelock problem? Here are the
real times I got with "time make check" on my slow 2-core Intel(R)
Celeron(R) CPU G1610T @ 2.30GHz running FreeBSD, which seems to be
prone to that problem on certain queries:

84940644 = 34.83s (before hash join regression tests)
fa330f9a = 35.81s (hash join regression tests)
18042840 = 44.92s (parallel-aware hash joins + new tests)

So the PHJ patch made it take 25% longer, similar to what you
reported. But then if I apply the following band-aid kludge to
condition_variable.c to limit the looping, I get:

- while (ConditionVariableSignal(cv))
+ while (nwoken < max_parallel_workers && ConditionVariableSignal(cv))
++nwoken;

18042840 + kludge = 36.70s

So without the effects of that bug it's only taking 2.4% longer than
commit fa330f9a. Is that acceptable for a feature of this size and
complexity? I will also look into making the data sets smaller.

--
Thomas Munro
http://www.enterprisedb.com
