Spilling hashed SetOps and aggregates to disk

From: Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
To: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Jeff Davis <pgsql(at)j-davis(dot)com>
Subject: Spilling hashed SetOps and aggregates to disk
Date: 2018-06-04 08:32:47
Message-ID: 87be3bd5-6b13-d76e-5618-6db0a4db584d@iki.fi
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

Hash Aggs and SetOps are currently not spilled to disk. If the planner's
estimate on the number of entries is badly off, you might run out of
memory at execution time, if all the entries don't fit in memory.

For HashAggs, this was discussed in depth a couple of years ago at [1].
SetOps have the same issue, but fixing that is simpler, as you don't
need to handle arbitrary aggregate transition values and functions.

So a while back, I started hacking on spilling SetOps, with the idea
that the code to deal with that could later be reused to deal with
HashAggs, too. I didn't get very far, but I'm posting this in this very
unfinished form to show what I've got, because I had a chat on this with
Jeff Davis and some others last week.

The logtape.c interface would be very useful for this. When you start
spilling, you want to create many spill files, so that when reloaded,
each file will fit comfortably in memory. With logtape.c, you can have
many logical tapes, without the overhead of real files. Furthermore, if
you need to re-spill because you a spill file grows too big in the first
pass, logtape.c allows reusing the space "on-the-fly". The only problem
with the current logtape interface is that it requires specifying the
number of "tapes" upfront, when the tapeset is created. However, I was
planning to change that, anyway [2].

[1]
https://www.postgresql.org/message-id/1407706010.6623.16.camel%40jeff-desktop

[2]
https://www.postgresql.org/message-id/420a0ec7-602c-d406-1e75-1ef7ddc58d83%40iki.fi

- Heikki

Attachment Content-Type Size
0001-Optimize-memory-usage-in-SetOp-executor-node.patch text/x-patch 5.2 KB
0002-Allow-SetOps-to-spill.patch text/x-patch 25.2 KB

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Pavel Stehule 2018-06-04 09:17:03 Re: plans for PostgreSQL 12
Previous Message Vik Fearing 2018-06-04 07:59:11 Re: plans for PostgreSQL 12