Re: [POC] Faster processing at Gather node

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Rafia Sabih <rafia(dot)sabih(at)enterprisedb(dot)com>, PostgreSQL Developers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [POC] Faster processing at Gather node
Date: 2017-11-09 05:08:22
Message-ID: CAA4eK1+aRt=0mXqtkw62oyj6cLsPbv4DoUm0OtzF=55yKuRG8A@mail.gmail.com
Lists: pgsql-hackers

On Wed, Nov 8, 2017 at 1:02 AM, Andres Freund <andres(at)anarazel(dot)de> wrote:
> Hi,
>
> On 2017-11-06 10:56:43 +0530, Amit Kapila wrote:
>> On Sun, Nov 5, 2017 at 6:54 AM, Andres Freund <andres(at)anarazel(dot)de> wrote:
>> > On 2017-11-05 01:05:59 +0100, Robert Haas wrote:
>> >> skip-gather-project-v1.patch does what it says on the tin. I still
>> >> don't have a test case for this, and I didn't find that it helped very
>> >> much,
>>
>> I am also wondering in which case it can help and I can't think of the
>> case.
>
> I'm confused? Isn't it fairly obvious that unnecessarily projecting
> at the gather node is wasteful? Obviously depending on the query you'll
> see smaller / bigger gains, but that there are benefits should be fairly obvious?
>
>

I agree that there could be benefits depending on the statement. I
initially thought that we were re-evaluating the expressions in the
target list as part of projection even if the worker backend had already
done that, but that is not the case; instead, we are deforming the
tuples sent by the workers. Now, I think that as a general principle it
is a good idea to delay the deforming as much as possible.

About the patch,

     /*
-     * Initialize result tuple type and projection info.
-     */
-    ExecAssignResultTypeFromTL(&gatherstate->ps);
-    ExecAssignProjectionInfo(&gatherstate->ps, NULL);
-
-    /*
      * Initialize funnel slot to same tuple descriptor as outer plan.
      */
     if (!ExecContextForcesOids(&gatherstate->ps, &hasoid))
@@ -115,6 +109,12 @@ ExecInitGather(Gather *node, EState *estate, int eflags)
     tupDesc = ExecTypeFromTL(outerNode->targetlist, hasoid);
     ExecSetSlotDescriptor(gatherstate->funnel_slot, tupDesc);

+    /*
+     * Initialize result tuple type and projection info.
+     */
+    ExecAssignResultTypeFromTL(&gatherstate->ps);
+    ExecConditionalAssignProjectionInfo(&gatherstate->ps, tupDesc, OUTER_VAR);
+

This change looks suspicious to me. I don't think we can use the
tupDesc constructed from the targetlist here. One problem I can see is
that the hasoid check in tlist_matches_tupdesc won't give the correct
answer. In the scan case, we use the tuple descriptor stored in the
relation descriptor, which allows us to make the right decision in
tlist_matches_tupdesc. You can reproduce the problem I am trying to
highlight by running the statement CREATE TABLE as_select1 AS SELECT *
FROM pg_class WHERE relkind = 'r'; with force_parallel_mode=regress.
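
To make the concern a bit more concrete, here is a toy model in C. It is
not PostgreSQL code: ToyTupleDesc, ToyContext and both functions are
made-up names, and only the OID flag is modelled, on the assumption that
the relevant part of tlist_matches_tupdesc boils down to comparing the
hasoid setting forced by the context with the one recorded in the
descriptor. Under that assumption, a descriptor whose hasoid flag was
itself copied from the context can never disagree with the context,
whereas a descriptor stored with the relation can.

```c
#include <stdbool.h>

/* Toy stand-in for a tuple descriptor: only the OID flag matters here. */
typedef struct ToyTupleDesc
{
    bool tdhasoid;      /* does this descriptor say tuples carry an OID? */
} ToyTupleDesc;

/* Toy stand-in for the plan context (hypothetical, for illustration). */
typedef struct ToyContext
{
    bool forces_oids;   /* does the context demand a particular setting? */
    bool hasoid;        /* the demanded setting, valid if forces_oids */
} ToyContext;

/*
 * Models only a final hasoid comparison; the per-attribute checks are
 * left out.  Returns true when projection looks skippable.
 */
static bool
toy_hasoid_check(const ToyContext *ctx, const ToyTupleDesc *desc)
{
    if (ctx->forces_oids && ctx->hasoid != desc->tdhasoid)
        return false;   /* mismatch: a projection step is needed */
    return true;        /* no mismatch seen: projection may be skipped */
}

/*
 * Descriptor built from a target list, with the OID flag taken from the
 * context itself (false when nothing is forced).
 */
static ToyTupleDesc
toy_desc_from_targetlist(const ToyContext *ctx)
{
    ToyTupleDesc desc;

    desc.tdhasoid = ctx->forces_oids ? ctx->hasoid : false;
    return desc;
}
```

With a context that forces "no OIDs", a stored descriptor with tdhasoid
set fails toy_hasoid_check, so projection would happen. A descriptor
obtained from toy_desc_from_targetlist passes the same check regardless
of what the incoming tuples actually look like.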

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
