Re: Use generation context to speed up tuplesorts

From: David Rowley <dgrowleyml(at)gmail(dot)com>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>, Ronan Dunklau <ronan(dot)dunklau(at)aiven(dot)io>, Julien Rouhaud <rjuju123(at)gmail(dot)com>, PostgreSQL Developers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Tomas Vondra <tv(at)fuzzy(dot)cz>
Subject: Re: Use generation context to speed up tuplesorts
Date: 2022-03-22 15:08:44
Message-ID: CAApHDvqmXHKUi1CdLh9VVweSCgN_JvOiRpNZU1pg1rq6TEDyWQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, 23 Feb 2022 at 15:25, Andres Freund <andres(at)anarazel(dot)de> wrote:
> On 2022-02-18 12:10:51 +1300, David Rowley wrote:
> > The other way I thought to fix it was by changing the logic for when
> > generation blocks are freed. In the problem case mentioned above, the
> > block being freed is the current block (which was just allocated). I
> > made some changes to adjust this behaviour so that we no longer free
> > the block when the final chunk is pfree()'d. Instead, that now lingers
> > and can be reused by future allocations, providing they fit inside it.
>
> That makes sense to me, as long as we keep just one such block.

I've rewritten the patch in such a way that it no longer matters which
block becomes free. What I did was add a new field to the
GenerationContext named "freeblock". We now simply store up to 1
recently emptied block there instead of calling free() on it. When
we're allocating new memory, we'll try and make use of the "freeblock"
when we don't have enough space in the current block.

I feel like this is quite a good change as it saves continual
free/malloc calls for FIFO pattern users of the generation context.
Since that's pretty much the workload that this context is intended
for, that seems like a very good change to make.

I did consider only recording a block as free once it reaches the
maximum size. As I have the code currently, we'll continually reuse
all blocks, including the initial sized ones. It will end up filling
blocks quicker as we'll be reusing those smaller initial blocks
continually with FIFO workloads. I'm not entirely sure if that
matters. I just wanted to point out that I thought about it.

Aside from the freeblock, I've also added code for adding a "keeper"
block. This is allocated in the same malloc'd chunk as the Generation
context itself. One thing I'm not too sure about is if the keeper
block change is worth the trouble. If we pfree() all of the chunks
out of the keeper block, there's a special case in GenerationFree() to
empty that block rather than free() it. This also means we need a
special case in GenerationAlloc() so we try to see if the keeper block
has any space before we go and allocate a new block. My current
thoughts are that the keeper block is really only useful for
Generation contexts that remain very small and don't ever require
allocation of additional blocks. This will mean all the memory can be
allocated along with the context, which saves a malloc and free call.
Does anyone have any thoughts on this?

> > The problem I see with this method is that there still could be some
> > pathological case that causes us to end up storing just a single tuple per
> > generation block.
>
> Crazy idea: Detect the situation, and recompact. Create a new context, copy
> all the tuples over, delete the old context. That could be a win even in less
> adversarial situations than "a single tuple per generation block".

I think that would need some new MemoryContext API functions in which
callers could use to determine the fragmentation and do something
about it on the consumer side. Otherwise, as Tomas says, if we did it
from within the context code itself we'd have no visibility of what is
pointing to the memory we're moving around.

If nobody has any objections to the 0001 patch then I'd like to move
ahead with it in the next few days. For the 0002 patch, I'm currently
feeling like we shouldn't be using the Generation context for bounded
sorts. The only way I can think to do that is with an API change to
tuplesort. I feel 0001 is useful even without 0002.

David

Attachment Content-Type Size
v4-0001-Improve-the-generation-memory-allocator.patch text/plain 21.6 KB
v4-0002-Use-Generation-memory-contexts-to-store-tuples-in.patch text/plain 2.1 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Julien Rouhaud 2022-03-22 15:14:44 Re: New Object Access Type hooks
Previous Message Peter Eisentraut 2022-03-22 15:01:17 Re: psql - add SHOW_ALL_RESULTS option