Re: Explain buffers wrong counter with parallel plans

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: "Jonathan S(dot) Katz" <jonathan(dot)katz(at)excoventures(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Adrien Nayrat <adrien(dot)nayrat(at)anayrat(dot)info>, PostgreSQL mailing lists <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Explain buffers wrong counter with parallel plans
Date: 2018-07-31 03:30:00
Message-ID: CAA4eK1Ja_UeYCKVg1JKCeUUsH=Pb7Sg_qaWgzF2D7nxWr9m83g@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, Jul 31, 2018 at 12:58 AM, Andres Freund <andres(at)anarazel(dot)de> wrote:
> Hi,
>
> I'm not an expert in the area of the code, but here's a review anyway. I
> did not read through the entire thread.
>
>
> I think we should try to get this fixed soon, to make some progress
> towards release-ability. Or just declare it to be entirely unrelated to
> the release, and remove it from the open items list; not an unreasonable
> choice either. This has been an open item for quite a while...
>

I am okay either way, but I can work on getting it committed for this release.

>
>> diff --git a/src/backend/executor/execParallel.c b/src/backend/executor/execParallel.c
>> index 60aaa822b7e..ac22bedf0e2 100644
>> --- a/src/backend/executor/execParallel.c
>> +++ b/src/backend/executor/execParallel.c
>> @@ -979,9 +979,6 @@ ParallelQueryMain(dsm_segment *seg, shm_toc *toc)
>> /* Report workers' query for monitoring purposes */
>> pgstat_report_activity(STATE_RUNNING, debug_query_string);
>>
>> - /* Prepare to track buffer usage during query execution. */
>> - InstrStartParallelQuery();
>> -
>> /* Attach to the dynamic shared memory area. */
>> area_space = shm_toc_lookup(toc, PARALLEL_KEY_DSA, false);
>> area = dsa_attach_in_place(area_space, seg);
>> @@ -993,6 +990,9 @@ ParallelQueryMain(dsm_segment *seg, shm_toc *toc)
>> queryDesc->planstate->state->es_query_dsa = area;
>> ExecParallelInitializeWorker(queryDesc->planstate, toc);
>>
>> + /* Prepare to track buffer usage during query execution. */
>> + InstrStartParallelQuery();
>> +
>> /* Run the plan */
>> ExecutorRun(queryDesc, ForwardScanDirection, 0L, true);
>
> Isn't this an independent change? And one with potentially negative
> side effects? I think there's some arguments for changing this (and some
> against),
>

This is to ensure that buffer usage stats are tracked consistently in
master and worker backends. In the master backend, we don't track
the similar stats for ExecutorStart phase. Also, it would help us
give the same stats for buffer usage in parallel and non-parallel
version of the query.

> but I think it's a bad idea to do so in the same patch.
>

Agreed, we can do this part separately.

>
>> diff --git a/src/backend/executor/execProcnode.c b/src/backend/executor/execProcnode.c
>> index 36d2914249c..a0d49ec0fba 100644
>> --- a/src/backend/executor/execProcnode.c
>> +++ b/src/backend/executor/execProcnode.c
>> @@ -737,6 +737,13 @@ ExecShutdownNode(PlanState *node)
>>
>> planstate_tree_walker(node, ExecShutdownNode, NULL);
>>
>> + /*
>> + * Allow instrumentation to count stats collected during shutdown for
>> + * nodes that are executed atleast once.
>> + */
>> + if (node->instrument && node->instrument->running)
>> + InstrStartNode(node->instrument);
>> +
>> switch (nodeTag(node))
>> {
>> case T_GatherState:
>> @@ -755,5 +762,12 @@ ExecShutdownNode(PlanState *node)
>> break;
>> }
>>
>> + /*
>> + * Allow instrumentation to count stats collected during shutdown for
>> + * nodes that are executed atleast once.
>> + */
>> + if (node->instrument && node->instrument->running)
>> + InstrStopNode(node->instrument, 0);
>> +
>> return false;
>> }
>
> Duplicating the comment doesn't seem like a good idea. Just say
> something like "/* see comment for InstrStartNode above */".
>

Okay, will change in the next version.

>
>> diff --git a/src/backend/executor/nodeLimit.c b/src/backend/executor/nodeLimit.c
>> index ac5a2ff0e60..cf1851e235f 100644
>> --- a/src/backend/executor/nodeLimit.c
>> +++ b/src/backend/executor/nodeLimit.c
>> @@ -134,6 +134,8 @@ ExecLimit(PlanState *pstate)
>> node->position - node->offset >= node->count)
>> {
>> node->lstate = LIMIT_WINDOWEND;
>> + /* Allow nodes to release or shut down resources. */
>> + (void) ExecShutdownNode(outerPlan);
>> return NULL;
>> }
>>
>
> Um, is this actually correct? What if somebody rewinds afterwards?
> That won't happen for parallel query currently, but architecturally I
> don't think we can do so unconditionally?
>

Currently, this is done for the forward scans (basically under
ScanDirectionIsForward), is that enough, if not, what kind of
additional check do you think we need here? I am not sure at this
stage how we will design it for backward scans, but I think we already
do this at other places in the code, see below code in ExecutePlan.

ExecutePlan()
{
..
if (numberTuples && numberTuples == current_tuple_count)
{
/* Allow nodes to release or shut down resources. */
(void) ExecShutdownNode(planstate);
break;
}
..
}

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Andrey Lepikhov 2018-07-31 04:35:56 Re: Reduce amount of WAL generated by CREATE INDEX for gist, gin and sp-gist
Previous Message Fabien COELHO 2018-07-31 02:51:04 pg_default_acl missing 'n' case in doc