Quick Links

Re: Memory-Bounded Hash Aggregation

From:	Adam Lee <ali(at)pivotal(dot)io>
To:	Jeff Davis <pgsql(at)j-davis(dot)com>
Cc:	Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, Taylor Vesely <tvesely(at)pivotal(dot)io>, Melanie Plageman <mplageman(at)pivotal(dot)io>, pgsql-hackers(at)postgresql(dot)org
Subject:	Re: Memory-Bounded Hash Aggregation
Date:	2019-12-10 21:34:22
Message-ID:	20191210213422.GD1345@mars.local
Views:	Whole Thread \| Raw Message \| Download mbox \| Resend email
Thread:
Lists:	pgsql-hackers

On Wed, Dec 04, 2019 at 10:57:51PM -0800, Jeff Davis wrote:
> > About the `TODO: project needed attributes only` in your patch, when
> > would the input tuple contain columns not needed? It seems like
> > anything
> > you can project has to be in the group or aggregates.
>
> If you have a table like:
>
> CREATE TABLE foo(i int, j int, x int, y int, z int);
>
> And do:
>
> SELECT i, SUM(j) FROM foo GROUP BY i;
>
> At least from a logical standpoint, you might expect that we project
> only the attributes we need from foo before feeding them into the
> HashAgg. But that's not quite how postgres works. Instead, it leaves
> the tuples intact (which, in this case, means they have 5 attributes)
> until after aggregation and lazily fetches whatever attributes are
> referenced. Tuples are spilled from the input, at which time they still
> have 5 attributes; so naively copying them is wasteful.
>
> I'm not sure how often this laziness is really a win in practice,
> especially after the expression evaluation has changed so much in
> recent releases. So it might be better to just project all the
> attributes eagerly, and then none of this would be a problem. If we
> still wanted to be lazy about attribute fetching, that should still be
> possible even if we did a kind of "logical" projection of the tuple so
> that the useless attributes would not be relevant. Regardless, that's
> outside the scope of the patch I'm currently working on.
>
> What I'd like to do is copy just the attributes needed into a new
> virtual slot, leave the unneeded ones NULL, and then write it out to
> the tuplestore as a MinimalTuple. I just need to be sure to get the
> right attributes.
>
> Regards,
> Jeff Davis

Melanie and I tried this, had a installcheck passed patch. The way how
we verify it is composing a wide table with long unnecessary text
columns, then check the size it writes on every iteration.

Please check out the attachment, it's based on your 1204 version.

--
Adam Lee

Attachment	Content-Type	Size
spill_fewer_cols.patch	text/plain	5.5 KB

In response to

Re: Memory-Bounded Hash Aggregation at 2019-12-05 06:57:51 from Jeff Davis

Responses

Re: Memory-Bounded Hash Aggregation at 2019-12-22 23:05:46 from Jeff Davis

Browse pgsql-hackers by date

	From	Date	Subject
Next Message	Jens-Wolfhard Schicke-Uffmann	2019-12-10 21:44:17	Re: Contention on LWLock buffer_content, due to SHARED lock(?)
Previous Message	Alvaro Herrera	2019-12-10 21:12:47	Re: log bind parameter values on error