Re: explain analyze output with parallel workers - question about meaning of information for explain.depesz.com

From: Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Hubert Lubaczewski <depesz(at)depesz(dot)com>, pgsql-hackers mailing list <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: explain analyze output with parallel workers - question about meaning of information for explain.depesz.com
Date: 2017-12-05 21:23:53
Message-ID: CAEepm=2-LRnfwUBZDqQt+XAcd0af_ykNyyVvP3h1uB1AQ=e-eA@mail.gmail.com
Lists: pgsql-hackers

On Wed, Dec 6, 2017 at 12:02 AM, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> On Tue, Dec 5, 2017 at 1:29 PM, Thomas Munro
> <thomas(dot)munro(at)enterprisedb(dot)com> wrote:
>> Or would it be an option to change the time
>> ExecXXXRetrieveInstrumentation() is called so it is run only once?
>
> To me, that doesn't sound like a bad option. I think if we do so, then
> we don't even need to reinitialize the shared sort stats. I think
> something like the attached should work if we want to go this route. We
> can add a regression test if this is what we think is a good idea.
> Having said that, one problem I see with doing things this way is that
> in general we will display the accumulated stats of each worker, but
> for sort or some other special nodes (like hash), we will show the
> information of only the last loop. I am not sure if that is a matter
> of concern, but if we want to do it this way, then probably we should
> explain this in the documentation as well.

The hash version of this code is now committed as 5bcf389e. Here is a
patch for discussion that adds some extra tests to join.sql to
exercise rescans of a hash join under a Gather node. It fails on
HEAD because, as you described, the leader loses track of the
instrumentation pointer after the first loop (the Hash coding is the
same as the Sort coding), so it ends up with no instrumentation data.
If you move ExecParallelRetrieveInstrumentation() to
ExecParallelCleanup() as
you showed in your patch, then it passes. The way I'm asserting that
instrumentation data is making its way back to the leader is by
turning off leader participation and then checking if it knows how
many batches there were.
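
To make the failure mode concrete, here is a toy model in plain C. It
is not PostgreSQL code: ToyInstr, ToyNode, toy_retrieve(), toy_reinit()
and toy_run() are all made-up names, and the detail that the
reinitialize step clears stats through the node's current pointer is an
assumption of this sketch. It only illustrates how retrieving after
every loop can leave the leader with nothing, while retrieving once at
cleanup keeps the data.

```c
#include <stdlib.h>
#include <string.h>

/* Toy stand-in for per-worker hash instrumentation (hypothetical). */
typedef struct ToyInstr { int nbatch; } ToyInstr;

/* Toy stand-in for the leader's node state (hypothetical). */
typedef struct ToyNode {
    ToyInstr *shared_info;  /* starts out pointing at "shared memory" */
    int       is_local;     /* set once shared_info is a private copy */
} ToyNode;

/* Copy the stats into leader-local memory and repoint the node at the
 * copy. After this the node no longer knows the shared-memory address. */
static void toy_retrieve(ToyNode *node)
{
    ToyInstr *copy = malloc(sizeof(ToyInstr));

    if (copy == NULL)
        abort();
    memcpy(copy, node->shared_info, sizeof(ToyInstr));
    if (node->is_local)
        free(node->shared_info);
    node->shared_info = copy;
    node->is_local = 1;
}

/* Assumed for this sketch: before a rescan, stats are cleared through
 * whatever the node currently points at. */
static void toy_reinit(ToyNode *node)
{
    memset(node->shared_info, 0, sizeof(ToyInstr));
}

/* Run nloops scans; return the batch count the leader ends up seeing. */
int toy_run(int nloops, int retrieve_each_loop)
{
    ToyInstr shm = {0};          /* the "shared memory" workers write to */
    ToyNode  node = {&shm, 0};
    int      result;

    for (int i = 0; i < nloops; i++)
    {
        if (i > 0)
            toy_reinit(&node);   /* rescan */
        shm.nbatch = 4;          /* a worker reports 4 batches */
        if (retrieve_each_loop)
            toy_retrieve(&node); /* retrieve at the end of every loop */
    }
    if (!retrieve_each_loop)
        toy_retrieve(&node);     /* retrieve once, at cleanup time */

    result = node.shared_info->nbatch;
    if (node.is_local)
        free(node.shared_info);
    return result;
}
```

In this model toy_run(2, 1) returns 0, because the second loop clears
and re-copies the leader's private copy rather than the shared memory
the worker wrote to, whereas toy_run(2, 0) returns 4. With a single
loop both variants return 4, which is why the problem only shows up
under rescans.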

--
Thomas Munro
http://www.enterprisedb.com

Attachment: test-hash-join-rescan-instr-v1.patch (application/octet-stream, 6.7 KB)