Re: Use generation context to speed up tuplesorts

From: David Rowley <dgrowleyml(at)gmail(dot)com>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>, Ronan Dunklau <ronan(dot)dunklau(at)aiven(dot)io>, Julien Rouhaud <rjuju123(at)gmail(dot)com>, PostgreSQL Developers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Tomas Vondra <tv(at)fuzzy(dot)cz>
Subject: Re: Use generation context to speed up tuplesorts
Date: 2022-04-01 09:00:08
Message-ID: CAApHDvpf00Dop=DjkbFVYdYBzvZwnmDfXxiLW5rAPDkB_9v41A@mail.gmail.com
Lists: pgsql-hackers

On Wed, 23 Mar 2022 at 04:08, David Rowley <dgrowleyml(at)gmail(dot)com> wrote:
> If nobody has any objections to the 0001 patch then I'd like to move
> ahead with it in the next few days. For the 0002 patch, I'm currently
> feeling like we shouldn't be using the Generation context for bounded
> sorts. The only way I can think to do that is with an API change to
> tuplesort. I feel 0001 is useful even without 0002.

0001:
I've made some small revisions to the 0001 patch so that the keeper
and freeblock are only used again when they're entirely empty. The
situation I want to avoid is this: when the current block does not
have enough free space to store some new allocation, we'd then try
the freeblock and then the keeper block. The problem I saw there is
that we may previously have partially filled the keeper or freeblock
and been unable to store some medium sized allocation, which caused
us to move to a new block. If we didn't first check that the keeper
and freeblock were empty, then we could end up storing some small
allocation in there where a previous medium sized allocation
couldn't fit. While that's not necessarily an actual problem, what it
does mean is that consecutive allocations might not end up in the
same block or the next block. Over time, in a FIFO type workload, it
would be possible to get fragmentation, which could result in being
unable to free blocks. For longer lived contexts I imagine that could
end up fairly bad. The updated patch should avoid that problem.
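
To make the block selection rule above concrete, here is a minimal
sketch in C. It is a toy model and not the code from the patch: the
SketchBlock struct, the SketchChoice enum and the sketch_* functions
are all hypothetical names invented for illustration, and "empty" is
modelled simply as "no bytes used". It only shows the ordering
described above: current block first, then the freeblock and keeper,
and those two only when they are entirely empty.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for an allocator block (hypothetical, not the real struct). */
typedef struct SketchBlock
{
    size_t capacity;    /* total usable bytes in the block */
    size_t used;        /* bytes handed out so far; 0 means entirely empty */
} SketchBlock;

typedef enum SketchChoice
{
    SKETCH_USE_CURRENT,
    SKETCH_USE_FREEBLOCK,
    SKETCH_USE_KEEPER,
    SKETCH_NEED_NEW_BLOCK
} SketchChoice;

/* A block may be recycled only if nothing at all is stored in it. */
static bool
sketch_block_is_empty(const SketchBlock *block)
{
    return block != NULL && block->used == 0;
}

/* Decide where an allocation of "request" bytes should go. */
static SketchChoice
sketch_choose_block(const SketchBlock *current,
                    const SketchBlock *freeblock,
                    const SketchBlock *keeper,
                    size_t request)
{
    /* Prefer the current block so consecutive allocations stay together. */
    if (current != NULL && current->capacity - current->used >= request)
        return SKETCH_USE_CURRENT;

    /* Reuse the freeblock only when it is entirely empty. */
    if (sketch_block_is_empty(freeblock) && freeblock->capacity >= request)
        return SKETCH_USE_FREEBLOCK;

    /* Likewise for the keeper block. */
    if (sketch_block_is_empty(keeper) && keeper->capacity >= request)
        return SKETCH_USE_KEEPER;

    /* A partially filled keeper or freeblock is never back-filled. */
    return SKETCH_NEED_NEW_BLOCK;
}
```

In this model a partially filled keeper with plenty of free space is
still skipped, which is the point of the revision: a small allocation
never lands in a block that an earlier, larger allocation had to move
past.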

0002:
This modifies the tuplesort API so that instead of having a
randomAccess bool flag, it takes a bitwise flags parameter to which
we can add further options in the future. It's slightly annoying to
break the API, but it's not exactly going to be hard for people to
code around that. It might also mean we don't have to break the API
in the future if we're making some change where we can just add a new
bitwise flag.
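
As a rough illustration of the API shape described above, the sketch
below contrasts a bool parameter with a bitwise options parameter. It
is not the tuplesort code: the SKETCH_SORT_* macros, the
SketchSortState struct and the sketch_sort_* functions are
hypothetical names and values chosen for this example.

```c
#include <stdbool.h>

/* Illustrative option bits; names and values are assumptions. */
#define SKETCH_SORT_NONE          0
#define SKETCH_SORT_RANDOMACCESS  (1 << 0)

typedef struct SketchSortState
{
    int work_mem_kb;    /* memory budget, kept only to mimic a real signature */
    int sortopt;        /* bitwise OR of SKETCH_SORT_* options */
} SketchSortState;

/* Old shape: one bool per option, so each new option changes the signature. */
static SketchSortState
sketch_sort_begin_old(int work_mem_kb, bool random_access)
{
    SketchSortState state;

    state.work_mem_kb = work_mem_kb;
    state.sortopt = random_access ? SKETCH_SORT_RANDOMACCESS : SKETCH_SORT_NONE;
    return state;
}

/* New shape: callers pass option bits; new bits need no signature change. */
static SketchSortState
sketch_sort_begin(int work_mem_kb, int sortopt)
{
    SketchSortState state;

    state.work_mem_kb = work_mem_kb;
    state.sortopt = sortopt;
    return state;
}

/* Option checks become a mask test instead of a dedicated bool field. */
static bool
sketch_sort_random_access(const SketchSortState *state)
{
    return (state->sortopt & SKETCH_SORT_RANDOMACCESS) != 0;
}
```

Under these assumptions, a caller that used to pass true would pass
SKETCH_SORT_RANDOMACCESS, and a caller that passed false would pass
SKETCH_SORT_NONE.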

0003:
This adds a new TUPLESORT_ALLOWBOUNDED flag and modifies the places
where we set a sort bound to pass the flag. The patch only uses the
generation context when the flag is not passed.
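
The gating described above could look something like the following
sketch. Only the flag's purpose comes from the description; the
SKETCH_TUPLESORT_ALLOWBOUNDED macro and its bit value, the
SketchContextKind enum and both functions are hypothetical and are
not taken from the patch.

```c
#include <stdbool.h>

/* Stand-in for TUPLESORT_ALLOWBOUNDED; the bit value is an assumption. */
#define SKETCH_TUPLESORT_ALLOWBOUNDED  (1 << 1)

typedef enum SketchContextKind
{
    SKETCH_CONTEXT_GENERATION,  /* used when the sort can never be bounded */
    SKETCH_CONTEXT_DEFAULT      /* used when the caller may set a bound */
} SketchContextKind;

/* Pick the memory context kind for tuple storage from the option bits. */
static SketchContextKind
sketch_pick_tuple_context(int sortopt)
{
    if (sortopt & SKETCH_TUPLESORT_ALLOWBOUNDED)
        return SKETCH_CONTEXT_DEFAULT;

    return SKETCH_CONTEXT_GENERATION;
}

/* Setting a bound is only valid if the caller asked for it up front. */
static bool
sketch_can_set_bound(int sortopt)
{
    return (sortopt & SKETCH_TUPLESORT_ALLOWBOUNDED) != 0;
}
```

The design choice this models is that the decision is made once, when
the sort begins, so a caller that intends to set a bound later has to
say so in the options it passes.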

I feel this is a pretty simple patch and if nobody has any objections
then I plan to push all 3 of them on my New Zealand Monday morning.

David

Attachment                                                       Content-Type  Size
v5-0001-Improve-the-generation-memory-allocator.patch            text/plain    22.7 KB
v5-0002-Adjust-tuplesort-API-to-have-bitwise-flags-instea.patch  text/plain    25.7 KB
v5-0003-Use-Generation-memory-contexts-to-store-tuples-in.patch  text/plain    5.2 KB
