Parallelized polymorphic aggs, and aggtype vs aggoutputtype

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Robert Haas <robertmhaas(at)gmail(dot)com>, David Rowley <david(dot)rowley(at)2ndquadrant(dot)com>
Cc: pgsql-hackers(at)postgreSQL(dot)org
Subject: Parallelized polymorphic aggs, and aggtype vs aggoutputtype
Date: 2016-06-16 20:31:14
Message-ID: 8229.1466109074@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

I believe I've narrowed down the cause of this failure [1]:

regression=# explain SELECT min(col) FROM enumtest;
ERROR: type matched to anyenum is not an enum type: anyenum

The core reason for this seems to be that apply_partialaggref_adjustment
fails to consider the possibility that it's dealing with a polymorphic
transtype. It will happily label the partial aggregate with a polymorphic
aggoutputtype, which is broken in itself (since we won't know the
properties of the output type for purposes such as datum-copying). But
the above error comes out before we have any opportunity to crash from
that, because we then end up with the upper-level combining aggregate
having an input that is shown as being of polymorphic type, which means
it can't resolve what its transtype is supposed to be.

The business about having a (newly added) separate aggserialtype makes
this doubly dangerous, because if that applies we would be showing the
input to the combining aggregate as being of the serial type, which might
have *nothing* to do with the original input type, thereby breaking
polymorphic transtype resolution even further. But whether we're using
that or just the transtype, showing the input to the combining aggregate
as being of some type other than the original input is likely to break any
agg implementation function that uses get_call_expr_argtype or similar to
figure out what type it's supposed to be working on.

We might need to fix this by adding the resolved transtype to Aggref
nodes, rather than trying to re-resolve it at executor start. Some
tracing I did on the way to identifying this bug suggests that that
would save a fair amount of planner and executor-start overhead,
because we are calling get_aggregate_argtypes and thereby
enforce_generic_type_consistency half a dozen times per aggregate,
not to mention resolve_aggregate_transtype itself.

However, before trying to fix any of that, I'd like to demand an
explanation as to why aggoutputtype exists at all. It seems incredibly
confusing to have both that and aggtype, and certainly the code comments
give no hint of what the semantics are supposed to be when those fields
are different. I think aggoutputtype needs to go away again.

regards, tom lane

[1] https://www.postgresql.org/message-id/22782.1466100870@sss.pgh.pa.us

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Thomas Kellerer 2016-06-16 20:42:21 Re: proposal: integration bloat tables (indexes) to core
Previous Message Vik Fearing 2016-06-16 20:31:02 Re: proposal: integration bloat tables (indexes) to core