Re: Buffers from parallel workers not accumulated to upper nodes with gather merge

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Jehan-Guillaume de Rorthais <jgdr(at)dalibo(dot)com>
Cc: PostgreSQL mailing lists <pgsql-bugs(at)lists(dot)postgresql(dot)org>
Subject: Re: Buffers from parallel workers not accumulated to upper nodes with gather merge
Date: 2020-07-20 07:01:48
Message-ID: CAA4eK1L28FjP4D7f7F3FMiq+x9wa5maJWG7vrBXUr0XFv20MMw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs

On Mon, Jul 20, 2020 at 11:30 AM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
>
> On Sat, Jul 18, 2020 at 7:32 PM Jehan-Guillaume de Rorthais
> <jgdr(at)dalibo(dot)com> wrote:
> >
> > Hi,
> >
> > A bug report[1] on pev2 project exposes another bug from explain analyze
> > buffers, similar to the one fixed 2 years ago[2].
> >
> > In this report, buffers from the workers are not propagated to upper nodes
> > on top of the Gather Merge node. On the opposite, buffers from workers ARE
> > propagated to upper nodes on top of a Gather node.
> >
> > Here is some SQL to reproduce:
> >
> > CREATE TABLE contacts (contactSourceName text, filler text)
> > WITH ( autovacuum_enabled = off);
> > INSERT INTO contacts
> > SELECT rpad(chr(64+i)::text,10,'0'),
> > rpad(j::text,500,'0')
> > FROM generate_series(1,26) i
> > CROSS JOIN generate_series(1,3200) j;
> >
> > SET work_mem TO '4MB';
> > SET random_page_cost to 4;
> > SET jit TO off ;
> > SET parallel_setup_cost TO 0;
> > SET parallel_tuple_cost TO 0;
> >
> > EXPLAIN (ANALYZE, BUFFERS, VERBOSE, COSTS OFF, TIMING OFF)
> > SELECT contactsourceName, count(*)
> > FROM contacts
> > GROUP BY contactSourceName;
> >
> > Finalize GroupAggregate (actual rows=26 loops=1)
> > Buffers: shared hit=2087 read=1463 written=32
> > -> Gather Merge (actual rows=70 loops=1)
> > Workers Planned: 2
> > Workers Launched: 2
> > Buffers: shared hit=3910 read=2049 written=96
> > -> Sort (actual rows=23 loops=3)
> > Buffers: shared hit=3910 read=2049 written=96
> > Worker 0:
> > Buffers: shared hit=914 read=287 written=32
> > Worker 1:
> > Buffers: shared hit=909 read=299 written=32
> > [...]
> >
> > Gather Merge exposes shared hit=3910.
> > Finalize GroupAggregate exposes shared hit=2087.
> > The 909 and 914 buffers from workers have not been propagated.
> >
> > Comparing with a Gather node:
> >
> > EXPLAIN (ANALYZE, BUFFERS, VERBOSE, COSTS OFF, TIMING OFF) SELECT count(*) AS
> > c FROM contacts;
> >
> > Finalize Aggregate (actual rows=1 loops=1)
> > Output: count(*)
> > Buffers: shared hit=3894 read=2049 written=96
> > -> Gather (actual rows=3 loops=1)
> > Workers Planned: 2
> > Workers Launched: 2
> > Buffers: shared hit=3894 read=2049 written=96
> > -> Partial Aggregate (actual rows=1 loops=3)
> > Buffers: shared hit=3894 read=2049 written=96
> > Worker 0:
> > Buffers: shared hit=870 read=299 written=32
> > Worker 1:
> > Buffers: shared hit=887 read=301 written=32
> > [...]
> >
> > Buffers are correctly propagated to the Finalize Aggregate node.
> >
> > I'm not perfectly clear with the code, but here is my analysis of the issue
> > and a fix proposal.
> >
> > The Merge node works correctly because it calls ExecShutdownGatherWorkers as
> > soon as the workers are exhausted from gather_readnext. Because of this,
> > buffers from workers are already accounted and propagated to upper nodes
> > before the recursive call of ExecShutdownNode on each nodes. There's no similar
> > call to ExecShutdownGatherMergeWorkers for Gather Merge. Adding a call to
> > ExecShutdownGatherMergeWorkers in gather_merge_getnext when workers are
> > exhausted seems to fix the issue, but I feel like this is the wrong place to
> > fix this issue.
> >
>
> Why do you think so? I think irrespective of whether we want to call
> ExecShutdownGatherMergeWorkers in gather_merge_getnext (when we know
> we are done aka binaryheap_empty) to fix this particular issue, it is
> better to shutdown the workers as soon as we are done similar to what
> we do for Gather node. It is good to release resources as soon as we
> can.
>

Apart from above, we should also do what you are suggesting in the
patch as that is also a problem and your fix seems correct to me.

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

In response to

Browse pgsql-bugs by date

  From Date Subject
Next Message Jehan-Guillaume de Rorthais 2020-07-20 07:59:27 Re: Buffers from parallel workers not accumulated to upper nodes with gather merge
Previous Message Amit Kapila 2020-07-20 06:00:34 Re: Buffers from parallel workers not accumulated to upper nodes with gather merge