Re: Parallel Aggregate

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: David Rowley <david(dot)rowley(at)2ndquadrant(dot)com>
Cc: Haribabu Kommi <kommi(dot)haribabu(at)gmail(dot)com>, Paul Ramsey <pramsey(at)cleverelephant(dot)ca>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Parallel Aggregate
Date: 2016-03-15 22:00:13
Message-ID: CA+TgmoaLAsAH_Eki=zZfx17RKZ6DnDZXMB6MhNQYJ8JoHojgzw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, Mar 15, 2016 at 5:50 PM, David Rowley
<david(dot)rowley(at)2ndquadrant(dot)com> wrote:
>> I still think that's solving the problem the wrong way. Why can't
>> exprType(), when applied to the Aggref, do something like this?
>>
>> {
>> Aggref *aref = (Aggref *) expr;
>> if (aref->aggpartial)
>> return aref->aggtranstype;
>> else
>> return aref->aggtype;
>> }
>>
>> The obvious answer is "well, because those fields don't exist in
>> Aggref". But shouldn't they? From here, it looks like PartialAggref
>> is a cheap hack around not having whacked Aggref around hard for
>> partial aggregation.
>
> We could do it that way if we left the aggpartial field out of the
> equals() check, but I think we go at length to not do that. Just look
> at what's done for all of the location fields. In any case if we did
> that then it might not actually be what we want all of the time...
> Perhaps in some cases we'd want equals() to return false when the
> aggpartial does not match, and in other cases we'd want it to return
> true. There's no way to control that behaviour, so to get around the
> setrefs.c problem I created the wrapper node type, which I happen to
> think is quite clean. Just see Tom's comments about Haribabu's temp
> fix for the problem where he put some hacks into the equals for aggref
> in [1].

I don't see why we would need to leave aggpartial out of the equals()
check. I must be missing something.

>>>> I don't see where this applies has_parallel_hazard or anything
>>>> comparable to the aggregate functions. I think it needs to do that.
>>>
>>> Not sure what you mean here.
>>
>> If the aggregate isn't parallel-safe, you can't do this optimization.
>> For example, imagine an aggregate function written in PL/pgsql that
>> for some reason writes data to a side table. It's
>> has_parallel_hazard's job to check the parallel-safety properties of
>> the functions used in the query.
>
> Sorry, I do know what you mean by that. I might have been wrong to
> assume that the parallelModeOK check did this. I will dig into how
> that is set exactly.

Hmm, sorry, I wasn't very accurate, there. The parallelModeOK check
will handle indeed the case where there are parallel-unsafe functions,
but it will not handle the case where there are parallel-restricted
functions. In that latter case, the query can still use parallelism
someplace, but the parallel-restricted functions cannot be executed
beneath the Gather.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Arjen Nienhuis 2016-03-15 22:01:22 NOT LIKE index support
Previous Message Alvaro Herrera 2016-03-15 21:58:50 Re: pgbench stats per script & other stuff