Re: pgsql: Add parallel-aware hash joins.

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>
Cc: Andres Freund <andres(at)anarazel(dot)de>, pgsql-committers <pgsql-committers(at)postgresql(dot)org>
Subject: Re: pgsql: Add parallel-aware hash joins.
Date: 2017-12-28 02:32:26
Message-ID: 30219.1514428346@sss.pgh.pa.us
Lists: pgsql-committers pgsql-hackers

Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com> writes:
> I'll address the instability of the regression test output separately.

If you're still looking for data on that --- prairiedog is able to
reproduce the "multibatch = f" variant about one time in thirty.
I modified the test case to print out the full EXPLAIN ANALYZE output
rather than a heavily filtered version. Here's a typical successful run:

! explain analyze
select length(max(s.t))
from wide left join (select id, coalesce(t, '') || '' as t from wide) s using (id);
! QUERY PLAN
! ------------------------------------------------------------------------------------------------------------------------------------------------
! Finalize Aggregate (cost=97.49..97.51 rows=1 width=4) (actual time=409.330..409.331 rows=1 loops=1)
! -> Gather (cost=97.28..97.49 rows=2 width=32) (actual time=376.957..395.841 rows=3 loops=1)
! Workers Planned: 2
! Workers Launched: 2
! -> Partial Aggregate (cost=97.28..97.29 rows=1 width=32) (actual time=254.370..254.373 rows=1 loops=3)
! -> Parallel Hash Left Join (cost=23.23..88.88 rows=3360 width=32) (actual time=240.812..241.297 rows=1 loops=3)
! Hash Cond: (wide.id = wide_1.id)
! -> Parallel Seq Scan on wide (cost=0.00..15.29 rows=529 width=4) (actual time=0.066..0.075 rows=1 loops=3)
! -> Parallel Hash (cost=16.61..16.61 rows=529 width=36) (actual time=109.565..109.565 rows=1 loops=3)
! Buckets: 1024 (originally 2048) Batches: 8 (originally 1) Memory Usage: 321kB
! -> Parallel Seq Scan on wide wide_1 (cost=0.00..16.61 rows=529 width=36) (actual time=2.989..7.218 rows=1 loops=3)
! Planning time: 1.317 ms
! Execution time: 424.395 ms
! (13 rows)

and here's a failing run:

! explain analyze
select length(max(s.t))
from wide left join (select id, coalesce(t, '') || '' as t from wide) s using (id);
! QUERY PLAN
! -------------------------------------------------------------------------------------------------------------------------------------------------
! Finalize Aggregate (cost=97.49..97.51 rows=1 width=4) (actual time=232.440..232.441 rows=1 loops=1)
! -> Gather (cost=97.28..97.49 rows=2 width=32) (actual time=225.738..226.744 rows=3 loops=1)
! Workers Planned: 2
! Workers Launched: 2
! -> Partial Aggregate (cost=97.28..97.29 rows=1 width=32) (actual time=29.377..29.379 rows=1 loops=3)
! -> Parallel Hash Left Join (cost=23.23..88.88 rows=3360 width=32) (actual time=22.747..25.340 rows=1 loops=3)
! Hash Cond: (wide.id = wide_1.id)
! -> Parallel Seq Scan on wide (cost=0.00..15.29 rows=529 width=4) (actual time=0.086..0.113 rows=2 loops=1)
! -> Parallel Hash (cost=16.61..16.61 rows=529 width=36) (actual time=16.382..16.382 rows=1 loops=3)
! Buckets: 1024 (originally 2048) Batches: 1 (originally 1) Memory Usage: 0kB
! -> Parallel Seq Scan on wide wide_1 (cost=0.00..16.61 rows=529 width=36) (actual time=9.167..21.301 rows=2 loops=1)
! Planning time: 1.289 ms
! Execution time: 243.120 ms
! (13 rows)
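
For reference, the "multibatch" column in the committed test output is just a
boolean distilled from plans like the two above, so the second plan is what
comes out as "multibatch = f".  A rough sketch of that sort of filtering (an
illustration only, not necessarily how the shipped test extracts it):

-- Illustration only: does the (Parallel) Hash in a query's EXPLAIN ANALYZE
-- output end up with more than one batch?
create or replace function hash_went_multibatch(query text)
returns boolean language plpgsql as
$$
declare
  ln text;
begin
  for ln in execute 'explain (analyze) ' || query
  loop
    if ln like '%Batches:%' then
      -- "Batches: 8 (originally 1)" => true, "Batches: 1 (originally 1)" => false
      return substring(ln from 'Batches: (\d+)')::int > 1;
    end if;
  end loop;
  return null;  -- no Hash node in the plan
end;
$$;

select hash_went_multibatch($q$
  select length(max(s.t))
  from wide left join (select id, coalesce(t, '') || '' as t from wide) s using (id)
$q$) as multibatch;

With the first plan above this yields "t"; with the second, the
"Batches: 1 ... Memory Usage: 0kB" line turns it into "f".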

I don't have enough insight to be totally sure what this means, but the
"Memory Usage: 0kB" bit is obviously bogus, so I'd venture that at least
part of the issue is failure to return stats from a worker. I also find
it most curious that the "success" case is a lot slower than the "not
success" case. Perhaps this is related to your livelock issue? Doing
another run, I get something even slower:

! explain analyze
select length(max(s.t))
from wide left join (select id, coalesce(t, '') || '' as t from wide) s using (id);
! QUERY PLAN
! -------------------------------------------------------------------------------------------------------------------------------------------------
! Finalize Aggregate (cost=97.49..97.51 rows=1 width=4) (actual time=487.245..487.246 rows=1 loops=1)
! -> Gather (cost=97.28..97.49 rows=2 width=32) (actual time=444.650..475.390 rows=3 loops=1)
! Workers Planned: 2
! Workers Launched: 2
! -> Partial Aggregate (cost=97.28..97.29 rows=1 width=32) (actual time=345.816..345.819 rows=1 loops=3)
! -> Parallel Hash Left Join (cost=23.23..88.88 rows=3360 width=32) (actual time=334.229..338.098 rows=1 loops=3)
! Hash Cond: (wide.id = wide_1.id)
! -> Parallel Seq Scan on wide (cost=0.00..15.29 rows=529 width=4) (actual time=0.065..0.074 rows=1 loops=3)
! -> Parallel Hash (cost=16.61..16.61 rows=529 width=36) (actual time=140.210..140.210 rows=1 loops=3)
! Buckets: 1024 (originally 2048) Batches: 8 (originally 1) Memory Usage: 321kB
! -> Parallel Seq Scan on wide wide_1 (cost=0.00..16.61 rows=529 width=36) (actual time=4.233..15.117 rows=1 loops=3)
! Planning time: 1.380 ms
! Execution time: 509.607 ms
! (13 rows)
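
A quick way to lean on this locally, rather than waiting for the buildfarm,
is to re-run the query in a loop and count how often the hash build reports
more than one batch.  Something along these lines, reusing the
hash_went_multibatch() sketch from above (again just an illustration, and
assuming the regression test's "wide" table is already in place):

do $do$
declare
  multibatch_runs int := 0;
  total_runs constant int := 30;
begin
  -- re-run the join and tally how often the build side went multibatch
  for i in 1 .. total_runs loop
    if hash_went_multibatch($q$
         select length(max(s.t))
         from wide left join (select id, coalesce(t, '') || '' as t from wide) s using (id)
       $q$) then
      multibatch_runs := multibatch_runs + 1;
    end if;
  end loop;
  raise notice 'multibatch in % of % runs', multibatch_runs, total_runs;
end
$do$;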

Aside from the instability problems, I'm pretty unhappy about how much
the PHJ patch has added to the runtime of "make check". I do not think
any one feature can justify adding 20% to that. Can't you cut down the
amount of data processed by these new test cases?
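
For what it's worth, the multi-batch cases these tests are after depend on
the ratio of data volume to work_mem rather than on absolute size, so the
same plan shapes can in principle be provoked with far less data by
shrinking both together.  A sketch of the idea, with a made-up table and
settings rather than a concrete patch against the tests:

begin;
-- Hypothetical small table: ~2000 rows of ~100 bytes is already several
-- times a 64kB work_mem, so a hash join's build side should still have to
-- split into multiple batches.
create table phj_small as
  select g as id, repeat('x', 100) as t
  from generate_series(1, 2000) g;
analyze phj_small;
set local work_mem = '64kB';
-- make a parallel plan attractive even for a tiny table
set local min_parallel_table_scan_size = 0;
set local parallel_setup_cost = 0;
set local parallel_tuple_cost = 0;
set local max_parallel_workers_per_gather = 2;
explain (analyze)
  select count(*)
  from phj_small a join phj_small b using (id);
rollback;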

regards, tom lane
