Re: [PATCH] Use optimized single-datum tuplesort in ExecSort

From: Ronan Dunklau <ronan(dot)dunklau(at)aiven(dot)io>
To: David Rowley <dgrowleyml(at)gmail(dot)com>
Cc: James Coleman <jtc331(at)gmail(dot)com>, PostgreSQL Developers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Ranier Vilela <ranier(dot)vf(at)gmail(dot)com>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>
Subject: Re: [PATCH] Use optimized single-datum tuplesort in ExecSort
Date: 2021-07-12 13:59:42
Message-ID: 6987360.ZhFIMvCLOo@aivenronan
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Le lundi 12 juillet 2021, 15:11:17 CEST David Rowley a écrit :
> On Wed, 7 Jul 2021 at 21:32, Ronan Dunklau <ronan(dot)dunklau(at)aiven(dot)io> wrote:
> > In the meantime I fixed some formatting issues, please find attached a new
> > patch.
>
> I started to look at this.

Thank you ! I'm attaching a new version of the patch taking your remarks into
account.
>
> First I wondered how often we might be able to apply this
> optimisation, so I ran make check after adding some elog(NOTICE) calls
> to output which method is going to be used just before we do the
> tuplestore_begin_* calls. It looks like there are 614 instances of
> Datum sorts and 4223 of tuple sorts. That's about 14.5% datum sorts.
> 223 of the 614 are byval types and the other 391 are byref. Not that
> the regression tests are a good reflection of the real world, but if
> it were then that's quite a good number of cases to be able to
> optimise.

That's an interesting stat.

>
> As for the patch, just a few things:
>
> 1. Can you add the missing braces in this if condition and the else
> condition that belongs to it.
>
> + if (node->is_single_val)
> + for (;;)
> + {
>

Done.

> 2. I think it would nicer to name the new is_single_val field
> "datumSort" instead. To me it seems more clear what it is for.

Done.

>
> 3. This seems to be a bug fix where byval datum sorts do not properly
> handle bounded sorts. I think that maybe that should be fixed and
> backpatched. I don't see anything that says Datum sorts can't be
> bounded and if there were some restriction on that I'd expect
> tuplesort_set_bound() to fail when the Tuplesortstate had been set up
> with tuplesort_begin_datum().

I've kept this as-is for now, but I will remove it from my patch if it is
deemed worthy of back-patching in your other thread.

Regards,

--
Ronan Dunklau

Attachment Content-Type Size
v5-0001-Allow-Sort-nodes-to-use-the-fast-single-datum-tuples.patch text/x-patch 5.2 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Matthias van de Meent 2021-07-12 14:30:41 Re: ATTACH PARTITION locking documentation for DEFAULT partitions
Previous Message Tom Lane 2021-07-12 13:38:48 Re: enable_resultcache confusion