Re: WIP Patch for GROUPING SETS phase 1

From: Andrew Gierth <andrew(at)tao11(dot)riddles(dot)org(dot)uk>
To: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>
Cc: Pg Hackers <pgsql-hackers(at)postgresql(dot)org>, Atri Sharma <atri(dot)jiit(at)gmail(dot)com>
Subject: Re: WIP Patch for GROUPING SETS phase 1
Date: 2014-08-21 15:01:16
Message-ID: 87d2btrkdz.fsf@news-spur.riddles.org.uk
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

>>>>> "Heikki" == Heikki Linnakangas <hlinnakangas(at)vmware(dot)com> writes:

Heikki> Uh, that's ugly. The EXPLAIN out I mean; as an implementation
Heikki> detail chaining the nodes might be reasonable. But the above
Heikki> gets unreadable if you have more than a few grouping sets.

It's good for highlighting performance issues in EXPLAIN, too.

4096 grouping sets takes about a third of a second to plan and execute,
but something like a minute to generate the EXPLAIN output. However,
for more realistic sizes, plan time is not significant and explain
takes only about 40ms for 256 grouping sets.

(To avoid resource exhaustion issues, we have set a limit of,
currently, 4096 grouping sets per query level. Without such a limit,
it is easy to write queries that would take TBs of memory to parse or
plan. MSSQL and DB2 have similar limits, I'm told.)

>> The ChainAggregate nodes use a tuplestore to communicate with the
>> GroupAggregate node at the top of the chain; they pass through input
>> tuples unchanged, and write aggregated result rows to the tuplestore,
>> which the top node then returns once it has finished its own result.

Heikki> Hmm, so there's a "magic link" between the GroupAggregate at
Heikki> the top and all the ChainAggregates, via the tuplestore. That
Heikki> may be fine, we have special rules in passing information
Heikki> between bitmap scan nodes too.

Eh. It's far from a perfect solution, but the planner doesn't lend itself
to perfect solutions.

Heikki> But rather than chain multiple ChainAggregate nodes, how
Heikki> about just doing all the work in the top GroupAggregate node?

It was easier this way. (How would you expect to do it all in the top
node when each subset of the grouping sets list needs to see the data
in a different order?)

--
Andrew (irc:RhodiumToad)

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Pavel Stehule 2014-08-21 15:05:54 Re: WIP Patch for GROUPING SETS phase 1
Previous Message Tom Lane 2014-08-21 15:00:25 Re: WIP Patch for GROUPING SETS phase 1