Re: Final Patch for GROUPING SETS - unrecognized node type: 347

From: Andrew Gierth <andrew(at)tao11(dot)riddles(dot)org(dot)uk>
To: Tomas Vondra <tv(at)fuzzy(dot)cz>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Final Patch for GROUPING SETS - unrecognized node type: 347
Date: 2014-09-06 21:34:15
Message-ID: 874mwka4hd.fsf@news-spur.riddles.org.uk
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

>>>>> "Tomas" == Tomas Vondra <tv(at)fuzzy(dot)cz> writes:

Tomas> I have significant doubts about the whole design,
Tomas> though. Especially the decision not to use HashAggregate,

There is no "decision not to use HashAggregate". There is simply no
support for HashAggregate yet.

Having it be able to work with GroupAggregate is essential, because
there are always cases where HashAggregate is simply not permitted
(e.g. when using distinct or sorted aggs; or unhashable types; or with
the current code, when the estimated memory usage exceeds work_mem).
HashAggregate may be a performance improvement, but it's something
that can be added afterwards rather than an essential part of the
feature.

Tomas> Now, the chaining only makes this worse, because it
Tomas> effectively forces a separate sort of the whole table for each
Tomas> grouping set.

It's not one sort per grouping set, it's the minimal number of sorts
needed to express the result as a union of ROLLUP clauses. The planner
code will (I believe) always find the smallest number of sorts needed.

Each aggregate node can process any number of grouping sets as long as
they represent a single rollup list (and therefore share a single sort
order).

Yes, this is slower than using one hashagg. But it solves the general
problem in a way that does not interfere with future optimization.

(HashAggregate can be added to the current implementation by first
adding executor support for hashagg with multiple grouping sets, then
in the planner, extracting as many hashable grouping sets as possible
from the list before looking for rollup lists. The chained aggregate
code can work just fine with a HashAggregate as the chain head.

We have not actually tackled this, since I'm not going to waste any
time adding optimizations before the basic idea is accepted.)

Tomas> What I envisioned when considering hacking on this a few
Tomas> months back, was extending the aggregate API with "merge
Tomas> state" function,

That's not really on the cards for arbitrary non-trivial aggregate
functions.

Yes, it can be done for simple ones, and if you want to use that as a
basis for adding optimizations that's fine. But a solution that ONLY
works in simple cases isn't sufficient, IMO.

--
Andrew (irc:RhodiumToad)

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2014-09-06 21:39:21 Re: Patch for psql History Display on MacOSX
Previous Message Tomas Vondra 2014-09-06 20:04:46 Re: Final Patch for GROUPING SETS - unrecognized node type: 347