Re: Parallelized polymorphic aggs, and aggtype vs aggoutputtype

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: David Rowley <david(dot)rowley(at)2ndquadrant(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Parallelized polymorphic aggs, and aggtype vs aggoutputtype
Date: 2016-06-16 22:40:10
Message-ID: 14538.1466116810@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Robert Haas <robertmhaas(at)gmail(dot)com> writes:
> On Thu, Jun 16, 2016 at 4:31 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> However, before trying to fix any of that, I'd like to demand an
>> explanation as to why aggoutputtype exists at all. It seems incredibly
>> confusing to have both that and aggtype, and certainly the code comments
>> give no hint of what the semantics are supposed to be when those fields
>> are different. I think aggoutputtype needs to go away again.

> fix_combine_agg_expr, or more search_indexed_tlist_for_partial_aggref,
> needs to be able to match a finalize-aggregate node type to a
> partial-aggregate node type beneath it.

Meh. I think this is probably telling us that trying to repurpose Aggref
as the representation of a partial aggregate may have been mistaken.
Or maybe just that fix_combine_agg_expr was a bad idea. It seems likely
to me that that could have been dispensed with altogether in return for
slightly more work in create_grouping_paths, for instance transforming
the upper-level Aggrefs into something that looks more like
Aggref(PartialAggref(original-arguments))
whereupon the pre-existing match logic should be sufficient to replace
the "PartialAggref(original-arguments)" subtree with a suitable Var
in the upper aggregation plan node.

> aggtype is one of the fields
> that gets compared as part of that process, and it won't be the same
> if you make aggtype mean the result of that particular node rather
> than the result of the aggregate. nodeAgg.c also uses aggtype for
> matching purposes; not sure if there is a similar problem there or
> not.

AFAICS, nodeAgg.c's check on that is not logically necessary: if you have
identical inputs and the same aggregate function then you should certainly
expect the same output type. The only reason we're making it is that the
code originally used equal() to detect identical aggregates, and somebody
slavishly copied all the comparisons when splitting up that test so as to
distinguish "identical inputs" from "identical aggregates". I'll reserve
judgement on whether search_indexed_tlist_for_partial_aggref has any idea
what it's doing in this regard.

regards, tom lane

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Greg Stark 2016-06-17 00:34:26 Re: 10.0
Previous Message Robert Haas 2016-06-16 22:10:37 Re: [COMMITTERS] pgsql: Update pg_stat_statements extension for parallel query.