Re: Memory-Bounded Hash Aggregation

From: Peter Geoghegan <pg(at)bowt(dot)ie>
To: Jeff Davis <pgsql(at)j-davis(dot)com>
Cc: Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, Taylor Vesely <tvesely(at)pivotal(dot)io>, Adam Lee <ali(at)pivotal(dot)io>, Melanie Plageman <mplageman(at)pivotal(dot)io>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Memory-Bounded Hash Aggregation
Date: 2020-02-04 23:08:11
Message-ID: CAH2-WznSae1vd5f0t0NO2FAMm2+52BMHaL_JbzvmB30Y50FUKw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, Feb 3, 2020 at 6:24 PM Jeff Davis <pgsql(at)j-davis(dot)com> wrote:
> And now I'm attaching another version of the main Hash Aggregation
> patch to be applied on top of the logtape.c patch.

Have you tested this against tuplesort.c, particularly parallel CREATE
INDEX? It would be worth trying to measure any performance impact.
Note that most parallel CREATE INDEX tuplesorts will do a merge within
each worker, and one big merge in the leader. It's much more likely to
have multiple passes than a regular serial external sort.

Parallel CREATE INDEX is currently accidentally disabled on the master
branch. That should be fixed in the next couple of days. You can
temporarily revert 74618e77 if you want to get it back for testing
purposes today.

Have you thought about integer overflow in your heap related routines?
This isn't as unlikely as you might think. See commit 512f67c8, for
example.

Have you thought about the MaxAllocSize restriction as it concerns
lts->freeBlocks? Will that be okay when you have many more tapes than
before?

> Not a lot of changes from the last version; mostly some cleanup and
> rebasing. But it's faster now with the logtape.c changes.

LogicalTapeSetExtend() seems to work in a way that assumes that the
tape is frozen. It would be good to document that assumption, and
possible enforce it by way of an assertion. The same remark applies to
any other assumptions you're making there.

--
Peter Geoghegan

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Thomas Munro 2020-02-04 23:53:27 Re: Memory-Bounded Hash Aggregation
Previous Message James Sewell 2020-02-04 22:43:00 Re: Minimal logical decoding on standbys