Re: [PATCH] Use optimized single-datum tuplesort in ExecSort

From: David Rowley <dgrowleyml(at)gmail(dot)com>
To: Ronan Dunklau <ronan(dot)dunklau(at)aiven(dot)io>
Cc: James Coleman <jtc331(at)gmail(dot)com>, PostgreSQL Developers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Ranier Vilela <ranier(dot)vf(at)gmail(dot)com>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>
Subject: Re: [PATCH] Use optimized single-datum tuplesort in ExecSort
Date: 2021-07-12 13:11:17
Message-ID: CAApHDvp3a5Kv=6D+uADGUWrcTRC_QL9ZbRMqRi7h+PVKy3mPMA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, 7 Jul 2021 at 21:32, Ronan Dunklau <ronan(dot)dunklau(at)aiven(dot)io> wrote:
> In the meantime I fixed some formatting issues, please find attached a new
> patch.

I started to look at this.

First I wondered how often we might be able to apply this
optimisation, so I ran make check after adding some elog(NOTICE) calls
to output which method is going to be used just before we do the
tuplestore_begin_* calls. It looks like there are 614 instances of
Datum sorts and 4223 of tuple sorts. That's about 14.5% datum sorts.
223 of the 614 are byval types and the other 391 are byref. Not that
the regression tests are a good reflection of the real world, but if
it were then that's quite a good number of cases to be able to
optimise.

As for the patch, just a few things:

1. Can you add the missing braces in this if condition and the else
condition that belongs to it.

+ if (node->is_single_val)
+ for (;;)
+ {

2. I think it would nicer to name the new is_single_val field
"datumSort" instead. To me it seems more clear what it is for.

3. This seems to be a bug fix where byval datum sorts do not properly
handle bounded sorts. I think that maybe that should be fixed and
backpatched. I don't see anything that says Datum sorts can't be
bounded and if there were some restriction on that I'd expect
tuplesort_set_bound() to fail when the Tuplesortstate had been set up
with tuplesort_begin_datum().

static void
free_sort_tuple(Tuplesortstate *state, SortTuple *stup)
{
- FREEMEM(state, GetMemoryChunkSpace(stup->tuple));
- pfree(stup->tuple);
+ /*
+ * If the SortTuple is actually only a single Datum, which was not copied
+ * as it is a byval type, do not try to free it nor account for it in
+ * memory used.
+ */
+ if (stup->tuple)
+ {
+ FREEMEM(state, GetMemoryChunkSpace(stup->tuple));
+ pfree(stup->tuple);
+ }

I can take this to another thread.

That's all I have for now.

David

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message David Rowley 2021-07-12 13:22:26 Is tuplesort meant to support bounded datum sorts?
Previous Message Tomas Vondra 2021-07-12 13:04:08 Re: pg_stats and range statistics