From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>
Cc: Andres Freund <andres(at)anarazel(dot)de>, pgsql-committers <pgsql-committers(at)postgresql(dot)org>
Subject: Re: pgsql: Add parallel-aware hash joins.
Date: 2017-12-30 22:34:17
Message-ID: 30655.1514673257@sss.pgh.pa.us
Lists: pgsql-committers pgsql-hackers
Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com> writes:
>> This is explained by the early exit case in
>> ExecParallelHashEnsureBatchAccessors(). With just the right timing,
>> it finishes up not reporting the true nbatch number, and never calling
>> ExecParallelHashUpdateSpacePeak().
> Hi Tom,
> You mentioned that prairiedog sees the problem about one time in
> thirty. Would you mind checking if it goes away with this patch
> applied?
I've run 55 cycles of "make installcheck" without seeing a failure
with this patch installed. That's not enough to be totally sure,
of course, but I think this probably fixes it.
However ... I noticed that my other dinosaur gaur shows the other failure
mode we see in the buildfarm, the "increased_batches = t" diff, and
I can report that this patch does *not* help that. The underlying
EXPLAIN output goes from something like
! Finalize Aggregate (cost=823.85..823.86 rows=1 width=8) (actual time=1378.102..1378.105 rows=1 loops=1)
! -> Gather (cost=823.63..823.84 rows=2 width=8) (actual time=1377.909..1378.006 rows=3 loops=1)
! Workers Planned: 2
! Workers Launched: 2
! -> Partial Aggregate (cost=823.63..823.64 rows=1 width=8) (actual time=1280.298..1280.302 rows=1 loops=3)
! -> Parallel Hash Join (cost=387.50..802.80 rows=8333 width=0) (actual time=1070.179..1249.142 rows=6667 loops=3)
! Hash Cond: (r.id = s.id)
! -> Parallel Seq Scan on simple r (cost=0.00..250.33 rows=8333 width=4) (actual time=0.173..62.063 rows=6667 loops=3)
! -> Parallel Hash (cost=250.33..250.33 rows=8333 width=4) (actual time=454.305..454.305 rows=6667 loops=3)
! Buckets: 4096 Batches: 8 Memory Usage: 208kB
! -> Parallel Seq Scan on simple s (cost=0.00..250.33 rows=8333 width=4) (actual time=0.178..67.115 rows=6667 loops=3)
! Planning time: 1.861 ms
! Execution time: 1687.311 ms
to something like
! Finalize Aggregate (cost=823.85..823.86 rows=1 width=8) (actual time=1588.733..1588.737 rows=1 loops=1)
! -> Gather (cost=823.63..823.84 rows=2 width=8) (actual time=1588.529..1588.634 rows=3 loops=1)
! Workers Planned: 2
! Workers Launched: 2
! -> Partial Aggregate (cost=823.63..823.64 rows=1 width=8) (actual time=1492.631..1492.635 rows=1 loops=3)
! -> Parallel Hash Join (cost=387.50..802.80 rows=8333 width=0) (actual time=1270.309..1451.501 rows=6667 loops=3)
! Hash Cond: (r.id = s.id)
! -> Parallel Seq Scan on simple r (cost=0.00..250.33 rows=8333 width=4) (actual time=0.219..158.144 rows=6667 loops=3)
! -> Parallel Hash (cost=250.33..250.33 rows=8333 width=4) (actual time=634.614..634.614 rows=6667 loops=3)
! Buckets: 4096 (originally 4096) Batches: 16 (originally 8) Memory Usage: 176kB
! -> Parallel Seq Scan on simple s (cost=0.00..250.33 rows=8333 width=4) (actual time=0.182..120.074 rows=6667 loops=3)
! Planning time: 1.931 ms
! Execution time: 2219.417 ms
so again we have a case where the plan didn't change but the execution
behavior did. This isn't quite 100% reproducible on gaur/pademelon,
but it fails more often than not, it seems, so I can poke into it
if you can say what info would be helpful.
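[For reference, the plan shape above corresponds to a query of roughly this form; the table and join condition are taken from the EXPLAIN output, but the exact GUC settings are assumptions about how the regression test coaxes a small, multi-batch Parallel Hash Join:]

```sql
-- Hypothetical reconstruction of the query behind the EXPLAIN output above,
-- inferred from the node names shown (Hash Cond: r.id = s.id, count aggregate).
set max_parallel_workers_per_gather = 2;  -- matches "Workers Planned: 2"
set enable_parallel_hash = on;
set work_mem = '192kB';  -- assumption: a low work_mem forces multiple batches

explain (analyze)
select count(*) from simple r join simple s on r.id = s.id;
```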
regards, tom lane
From | Date | Subject | |
---|---|---|---|
Next Message | Thomas Munro | 2017-12-30 23:28:58 | Re: pgsql: Add parallel-aware hash joins. |
Previous Message | Thomas Munro | 2017-12-30 21:59:26 | Re: pgsql: Add parallel-aware hash joins. |