Re: Parallel Aggregate

From: David Rowley <david(dot)rowley(at)2ndquadrant(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Haribabu Kommi <kommi(dot)haribabu(at)gmail(dot)com>, Paul Ramsey <pramsey(at)cleverelephant(dot)ca>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Parallel Aggregate
Date: 2016-03-16 01:23:53
Message-ID: CAKJS1f9672Jw0_JSAiGVj7ay17Q7yxEa220S_g-7ewQSoC5+nQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 16 March 2016 at 14:08, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> On Tue, Mar 15, 2016 at 8:55 PM, David Rowley
> <david(dot)rowley(at)2ndquadrant(dot)com> wrote:
>> On 16 March 2016 at 13:42, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
>>> On Tue, Mar 15, 2016 at 8:04 PM, David Rowley
>>> <david(dot)rowley(at)2ndquadrant(dot)com> wrote:
>>>> On 16 March 2016 at 12:58, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
>>>>> ...and why would one be true and the other false?
>>>>
>>>> One would be the combine aggregate (having aggpartial = false), and
>>>> the one in the subnode would be the partial aggregate (having
>>>> aggpartial = true)
>>>> Notice in create_grouping_paths() I build a partial aggregate version
>>>> of the PathTarget named partial_group_target, this one goes into the
>>>> partial agg node, and Gather node. In this case the aggpartial will be
>>>> set differently for the Aggrefs in each of the two PathTargets, so it
>>>> would not be possible in setrefs.c to find the correct target list
>>>> entry in the subnode by using equal(). It'll just end up triggering
>>>> the elog(ERROR, "Aggref not found in subplan target list"); error.
>>>
>>> OK, I get it now. I still don't like it very much. There's no
>>> ironclad requirement that we use equal() here as opposed to some
>>> bespoke comparison function with the exact semantics we need, and ISTM
>>> that getting rid of PartialAggref would shrink this patch down quite a
>>> bit.
>>
>> Well that might work. I'd not thought of doing it that way. The only
>> issue that I can foresee with that is that when new fields are added
>> to Aggref in the future, we might miss updating that custom comparison
>> function to include them.
>
> That's true, but it doesn't seem like that big a deal. A code comment
> in the Aggref definition seems like sufficient insurance against such
> a mistake.
>
>> Should I update the patch to use the method you describe?
>
> Well, my feeling is that is going to make this a lot smaller and
> simpler, so I think so. But if you disagree strongly, let's discuss
> further.

Not strongly. It means that Aggref will need another field to store
the transtype and/or the serialtype (for the follow-on patches in
Combining Aggregates thread)
The only other issue which I don't think I've addressed yet is target
list width estimates. Probably that can just pay attention to the
aggpartial flag too, and if I fix that then likely the Aggref needs to
carry around a aggtransspace field too, which we really only need to
populate when the Aggref is in partial mode... it would be wasteful to
bother looking that up from the catalogs if we're never going to use
the Aggref in partial mode, and it seems weird to leave it as zero and
only populate it when we need it. So... if I put on my object oriented
hat and think about this.... My brain says that Aggref should inherit
from PartialAggref.... and that's what I have now... or at least some
C variant. So I really do think it's cleaner and easier to understand
by keeping the ParialAggref.

I see on looking at exprType() that we do have other node types which
conditionally return a different type, but seeing that does not fill
me with joy.

--
David Rowley http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Peter Geoghegan 2016-03-16 01:51:56 Re: [PROPOSAL] Covering + unique indexes.
Previous Message Thomas Munro 2016-03-16 01:13:48 Re: Support for N synchronous standby servers - take 2