Re: Use generation context to speed up tuplesorts

From: David Rowley <dgrowleyml(at)gmail(dot)com>
To: Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>
Cc: Andres Freund <andres(at)anarazel(dot)de>, Tomas Vondra <tv(at)fuzzy(dot)cz>, PostgreSQL Developers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Use generation context to speed up tuplesorts
Date: 2021-12-31 21:26:37
Message-ID: CAApHDvrTNYtHZUjUbYbDJ-hPWNbotnCXr2pN7+EJURirz7urPw@mail.gmail.com
Lists: pgsql-hackers

On Fri, 10 Sept 2021 at 01:38, Tomas Vondra
<tomas(dot)vondra(at)enterprisedb(dot)com> wrote:
> I've spent quite a bit of time investigating the regressions clearly
> visible in the benchmark results for some allocation/free patterns.
> Attached is a subset of results from a recent run, but the old results
> show mostly the same thing.

I started looking at the original patch again and wondered if it might
be a good idea to use the planner's estimates of the tuple width and
the number of tuples to help determine the size of the chunks in the
generation context. The tuple count estimate could also help size the
array that we're sorting, which would save us from allocating just 1024
elements and doubling that each time we run out of space in the array.
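
To show what I mean, here's a standalone toy rather than actual
tuplesort code (SortItem, the 10% slack and the numbers are all made up
for illustration; the real code would work with SortTuple and the
planner's path estimates):

#include <stdio.h>
#include <stdlib.h>

typedef struct SortItem
{
	void	   *tuple;
} SortItem;

int
main(void)
{
	double		est_tuples = 10000;	/* planner row estimate, when we have one */
	size_t		capacity;
	SortItem   *items;

	/* today: start with 1024 elements and double whenever we run out */
	capacity = 1024;
	items = malloc(capacity * sizeof(SortItem));
	while (capacity < (size_t) est_tuples)
	{
		capacity *= 2;
		items = realloc(items, capacity * sizeof(SortItem)); /* error checks omitted */
	}
	free(items);

	/* with an estimate: allocate close to the right size up front */
	capacity = (size_t) (est_tuples * 1.1) + 1;	/* a little slack for misestimates */
	items = malloc(capacity * sizeof(SortItem));

	printf("presized array capacity: %zu\n", capacity);
	free(items);
	return 0;
}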

The main problem with that is that there will be many cases where we
need to sort but don't have any estimates from the planner, so even if
we do this, we'll likely still need the chunk-growing code.

I've attached some benchmark results that I took recently. The
spreadsheet contains results from three versions: master, master +
patches 0001-0002, and master + patches 0001-0003. The 0003 patch makes
the code a bit more conservative about the chunk sizes it allocates and
also tries to size the tuple array according to the number of tuples we
expect to be able to sort in a single batch when the sort is not
estimated to fit inside work_mem.
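
Roughly the shape of the idea, as a standalone sketch (the constants,
the halving rule and the 48-byte overhead here are just placeholders
for illustration, not exactly what the patch does):

#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* round v up to the next power of 2, like pg_nextpower2_64() */
static uint64_t
next_power2(uint64_t v)
{
	uint64_t	p = 1;

	while (p < v)
		p <<= 1;
	return p;
}

int
main(void)
{
	double		est_tuples = 1000000;		/* planner row estimate */
	double		est_tupwidth = 40;			/* planner width estimate, in bytes */
	uint64_t	work_mem_bytes = 4UL * 1024 * 1024;			/* 4MB work_mem */
	uint64_t	per_tuple = (uint64_t) est_tupwidth + 48;	/* rough per-tuple overhead */
	uint64_t	est_tuple_memory = next_power2((uint64_t) est_tuples * per_tuple);
	uint64_t	chunk_size = est_tuple_memory;
	uint64_t	batch_tuples;

	/* be conservative: don't let one chunk swallow the whole work_mem budget */
	if (chunk_size >= work_mem_bytes)
		chunk_size = work_mem_bytes / 2;

	/*
	 * When the sort is not expected to fit in work_mem, size the tuple array
	 * for the tuples we expect to fit in a single batch rather than for the
	 * full row estimate.
	 */
	batch_tuples = work_mem_bytes / per_tuple;

	printf("chunk size: %" PRIu64 ", array elements: %" PRIu64 "\n",
		   chunk_size, batch_tuples);
	return 0;
}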

For the 4MB work_mem test in the spreadsheet, this does end up using
2MB chunks rather than 4MB chunks. This is because the 10k tuple test
is so close to requiring 4MB of memory to perform the sort. That does
mean that for the 10k tuple test, the 0003 patch makes things look
quite a bit slower than with just the 0001 + 0002 patches. However, I
still think that not making the chunk size the same size as work_mem
is a good idea.

The patch is still WIP. I still have a few magic numbers in there,
for example the 48 in:

state->est_tuple_memory = pg_nextpower2_64(state->est_tuples *
                                           (state->est_tupwidth + 48));

I should really have a sizeof() in there. I've not yet looked for
places other than nodeSort.c where I could pass tuple estimates down
into tuplesort.c.
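
To be concrete about the sizeof() part: I imagine the 48 is standing in
for something like the MinimalTuple header plus the generation
context's per-chunk overhead, so maybe something along these lines
(GENERATION_CHUNKHDRSZ is a made-up name here; generation.c currently
keeps its chunk header size private, so that would need exposing
somehow):

state->est_tuple_memory =
    pg_nextpower2_64(state->est_tuples *
                     (state->est_tupwidth +
                      MAXALIGN(SizeofMinimalTupleHeader) +
                      GENERATION_CHUNKHDRSZ)); /* hypothetical; not exported today */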

David

Attachment Content-Type Size
generation context tuplesort.ods application/vnd.oasis.opendocument.spreadsheet 85.8 KB
v2-0002-Use-tuple-estimate-to-size-tuple-array.patch application/octet-stream 8.5 KB
v2-0003-Improve-heuristics-for-memory-allocation-sizes-in.patch application/octet-stream 10.7 KB
v2-0001-Use-a-generation-context-for-tuple-storage-to-spe.patch application/octet-stream 8.9 KB
