Re: Explain buffers wrong counter with parallel plans

From: "Jonathan S(dot) Katz" <jonathan(dot)katz(at)excoventures(dot)com>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Adrien Nayrat <adrien(dot)nayrat(at)anayrat(dot)info>, Andres Freund <andres(at)anarazel(dot)de>, PostgreSQL mailing lists <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Explain buffers wrong counter with parallel plans
Date: 2018-07-25 14:12:29
Message-ID: BEC5CFD6-ADF1-4D12-938D-F2C38575EC63@excoventures.com
Lists: pgsql-hackers


> On Jul 7, 2018, at 12:03 AM, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
>
>> On Sat, Jul 7, 2018 at 7:45 AM, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
>> On Sat, Jul 7, 2018 at 12:44 AM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
>>> On Fri, Jul 6, 2018 at 9:44 AM, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
>>>> I have tried this idea, but it doesn't completely solve the problem.
>>>> The problem is that nodes below LIMIT won't get a chance to accumulate
>>>> the stats as they won't be able to call InstrStopNode.
>>>
>>> I'm not sure I understand. Why not? I see that we'd need to insert
>>> an extra call to InstrStopNode() if we were stopping the node while it
>>> was running, because then InstrStartNode() would have already been
>>> done, but the corresponding call to InstrStopNode() would not have
>>> been done. But I'm not sure how that would happen in this case. Can
>>> you explain further?
>>>
>>
>> Okay, let me try. The code flow is that for each node we call
>> InstrStartNode()->ExecProcNodeReal()->InstrStopNode(). Now let's say
>> we have to execute the plan Limit->Gather->Parallel SeqScan. Here,
>> for the Limit node we first call InstrStartNode() and
>> ExecProcNodeReal(), and then for Gather we call InstrStartNode(),
>> ExecProcNodeReal() and InstrStopNode(). Next, the Limit node decides
>> it needs to shut down all the nodes (ExecShutdownNode), and only
>> after that does it call InstrStopNode() for the Limit node. So, in
>> this flow, after shutting down the nodes, the Gather node never gets
>> a chance to use the stats collected during ExecShutdownNode.
>>
>
> I went ahead and tried the solution I had mentioned yesterday, that
> is, to allow ExecShutdownNode to count stats. Apart from fixing this
> problem, it will also fix the problem with Gather Merge reported by
> Adrien [1], because now Gather Merge will also get a chance to count
> stats after shutting down workers.
>
> Note that I have changed the location of InstrStartParallelQuery in
> ParallelQueryMain so that the buffer usage stats are accumulated only
> for the plan execution, which is what we do for the instrumentation
> information as well. If we don't do that, it will count some
> additional stats for ExecutorStart that won't match what we have in
> the Instrumentation structure of each node.
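
For anyone following along without the sources handy, the instrumented
wrapper Amit is describing looks roughly like this (a simplified sketch
based on src/backend/executor/execProcnode.c, not the patched code):

    /*
     * Simplified sketch: the ExecProcNode wrapper that is installed
     * when instrumentation (EXPLAIN ANALYZE) is requested.
     */
    #include "postgres.h"
    #include "executor/executor.h"
    #include "executor/instrument.h"

    static TupleTableSlot *
    ExecProcNodeInstr(PlanState *node)
    {
        TupleTableSlot *result;

        /* start timing / snapshot buffer usage for this call */
        InstrStartNode(node->instrument);

        /* run the node's real executor function, e.g. ExecGather() */
        result = node->ExecProcNodeReal(node);

        /* fold the deltas back into this node's Instrumentation */
        InstrStopNode(node->instrument, TupIsNull(result) ? 0.0 : 1.0);

        return result;
    }

In other words, by the time Limit calls ExecShutdownNode() on its
subtree, the final InstrStopNode() for Gather has already run, so
whatever the workers report while being shut down can only be counted
if ExecShutdownNode itself accumulates the stats, which is what the
patch does.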

I tried running the below on both 11beta2 and HEAD with the patch
applied:

CREATE UNLOGGED TABLE t1 (c1 int);
INSERT INTO t1 SELECT generate_series(1,100000000);
/** restart PostgreSQL */
EXPLAIN (analyze,buffers,timing off,costs off)
SELECT count(*) FROM t1;
/** repeat above two queries */

I have identical postgresql.conf files on both instances as well.

11beta2
======
buffers=# explain (analyze,buffers,timing off,costs off) select count(*)
from t1;
QUERY PLAN
--------------------------------------------------------------------------
 Finalize Aggregate (actual rows=1 loops=1)
   Buffers: shared read=63175
   ->  Gather (actual rows=7 loops=1)
         Workers Planned: 6
         Workers Launched: 6
         Buffers: shared read=63175
         ->  Partial Aggregate (actual rows=1 loops=7)
               Buffers: shared read=442478
               ->  Parallel Seq Scan on t1 (actual rows=14285714 loops=7)
                     Buffers: shared read=442478
 Planning Time: 1.422 ms
 Execution Time: 3214.407 ms
(12 rows)

buffers=# explain (analyze,buffers,timing off,costs off) select count(*)
from t1;
QUERY PLAN
--------------------------------------------------------------------------
 Finalize Aggregate (actual rows=1 loops=1)
   Buffers: shared hit=27 read=64960
   ->  Gather (actual rows=7 loops=1)
         Workers Planned: 6
         Workers Launched: 6
         Buffers: shared hit=27 read=64960
         ->  Partial Aggregate (actual rows=1 loops=7)
               Buffers: shared hit=224 read=442254
               ->  Parallel Seq Scan on t1 (actual rows=14285714 loops=7)
                     Buffers: shared hit=224 read=442254
 Planning Time: 0.080 ms
 Execution Time: 3774.091 ms
(12 rows)

HEAD
=====
buffers=# explain (analyze,buffers,timing off,costs off) select count(*)
from t1;
QUERY PLAN
--------------------------------------------------------------------------
 Finalize Aggregate (actual rows=1 loops=1)
   Buffers: shared read=442478
   ->  Gather (actual rows=7 loops=1)
         Workers Planned: 6
         Workers Launched: 6
         Buffers: shared read=442478
         ->  Partial Aggregate (actual rows=1 loops=7)
               Buffers: shared read=442478
               ->  Parallel Seq Scan on t1 (actual rows=14285714 loops=7)
                     Buffers: shared read=442478
 Planning Time: 1.594 ms
 Execution Time: 6207.799 ms
(12 rows)

buffers=# explain (analyze,buffers,timing off,costs off) select count(*)
from t1;
QUERY PLAN
--------------------------------------------------------------------------
 Finalize Aggregate (actual rows=1 loops=1)
   Buffers: shared hit=224 read=442254
   ->  Gather (actual rows=7 loops=1)
         Workers Planned: 6
         Workers Launched: 6
         Buffers: shared hit=224 read=442254
         ->  Partial Aggregate (actual rows=1 loops=7)
               Buffers: shared hit=224 read=442254
               ->  Parallel Seq Scan on t1 (actual rows=14285714 loops=7)
                     Buffers: shared hit=224 read=442254
 Planning Time: 0.074 ms
 Execution Time: 5006.395 ms
(12 rows)

Notice that the “read” numbers reported just under “Finalize Aggregate”
are much higher, and there appears to be a performance hit.

Jonathan
