Re: explain analyze output with parallel workers - question about meaning of information for explain.depesz.com

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Hubert Lubaczewski <depesz(at)depesz(dot)com>, pgsql-hackers mailing list <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: explain analyze output with parallel workers - question about meaning of information for explain.depesz.com
Date: 2017-12-05 05:39:56
Message-ID: CAA4eK1LmjXY3TtMLCa=nWpKVFhz78=OcMvDWUH6rTe5vHFk2Aw@mail.gmail.com
Lists: pgsql-hackers

On Mon, Dec 4, 2017 at 11:17 PM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> On Sat, Dec 2, 2017 at 8:04 AM, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
>> Attached patch contains regression test as well. Note that I have
>> carefully disabled all variable stats by using (analyze, timing off,
>> summary off, costs off) and then selected parallel sequential scan on
>> the right of join so that we have nloops and rows as variable stats
>> and those should remain constant.
>
> The regression test contains a whitespace error about which git diff
> --check complains.
>

Oops, a silly mistake on my side.

> Also, looking at this again, shouldn't the reinitialization of the
> instrumentation arrays happen in ExecParallelReinitialize rather than
> ExecParallelFinish, so that we don't spend time doing it unless the
> Gather is actually re-executed?
>

Yeah, that sounds better, so I have modified the patch accordingly.

I have one more observation in a somewhat related area. From the
code, it looks like we might have a problem with displaying sort
info for workers across rescans. I think the problem with the sort
info is that ExecSortRetrieveInstrumentation points the shared info
at local memory, after which the leader can no longer access the
values in shared memory that workers change during rescans. We might
be able to fix that by adding a local_info, of the same form as
shared_info, to the sort node. But the main problem is how to
accumulate stats for workers across rescans. The sort method can
change from one rescan to the next. We might be able to accumulate
the memory size, though I am not sure that is right. Although this
appears to be somewhat related to the problem being discussed in
this thread, it can be dealt with separately if we want to fix it.
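
To make the accumulation question concrete, here is a minimal
standalone sketch of a leader-local structure that folds in each
scan's per-worker sort stats. It is not PostgreSQL code: every name
in it (DemoWorkerSortStats, DemoLocalSortStats, demo_accumulate) is
hypothetical. Tracking the peak space rather than a sum, and flagging
a method change rather than picking one method, are assumptions made
only for illustration.

```c
#include <stddef.h>

/* Hypothetical stand-ins; these are NOT the actual PostgreSQL structs. */
typedef enum
{
    DEMO_SORT_NONE = 0,         /* worker did not perform a sort */
    DEMO_SORT_QUICKSORT,
    DEMO_SORT_EXTERNAL
} DemoSortMethod;

/* What one worker reports in shared memory for the latest scan. */
typedef struct
{
    DemoSortMethod method;
    long        space_kb;
} DemoWorkerSortStats;

/* Leader-local summary that survives across rescans. */
typedef struct
{
    DemoSortMethod last_method; /* method seen in the most recent scan */
    int         method_changed; /* nonzero if scans used different methods */
    long        peak_space_kb;  /* largest space reported by any scan */
    int         nscans;         /* scans in which this worker sorted */
} DemoLocalSortStats;

/*
 * Fold one scan's shared stats into the local summary.  The shared
 * array is only read, so workers can keep writing to it on rescans.
 */
static void
demo_accumulate(DemoLocalSortStats *local,
                const DemoWorkerSortStats *shared,
                size_t nworkers)
{
    for (size_t i = 0; i < nworkers; i++)
    {
        if (shared[i].method == DEMO_SORT_NONE)
            continue;

        if (local[i].nscans > 0 && local[i].last_method != shared[i].method)
            local[i].method_changed = 1;

        local[i].last_method = shared[i].method;
        if (shared[i].space_kb > local[i].peak_space_kb)
            local[i].peak_space_kb = shared[i].space_kb;
        local[i].nscans++;
    }
}
```

The sketch assumes the caller zero-initializes the local array and
calls demo_accumulate once per completed scan. Whether a peak, a sum,
or something else is the right thing to report is exactly the open
question above.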

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

Attachment Content-Type Size
fix_accum_instr_parallel_workers_v3.patch application/octet-stream 4.3 KB
