Re: [HACKERS] [PATCH] Incremental sort

From: "Tels" <nospam-pg-abuse(at)bloodgate(dot)com>
To: "Alexander Korotkov" <a(dot)korotkov(at)postgrespro(dot)ru>
Cc: "Antonin Houska" <ah(at)cybertec(dot)at>, "pgsql-hackers" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [HACKERS] [PATCH] Incremental sort
Date: 2018-01-04 23:21:28
Message-ID: 1505c7a5354ff0fb58eb8f1c43545b92.squirrel@sm.webmail.pair.com
Views: Raw Message | Whole Thread | Download mbox
Thread:
Lists: pgsql-hackers

Hello Alexander,

On Thu, January 4, 2018 4:36 pm, Alexander Korotkov wrote:
> On Fri, Dec 8, 2017 at 4:06 PM, Alexander Korotkov <
> a(dot)korotkov(at)postgrespro(dot)ru> wrote:
>
>> Thank you for pointing that. Sure, both cases are better. I've added
>> second case as well as comments. Patch is attached.

I had a quick look, this isn't a full review, but a few things struck me
on a read through the diff:

There are quite a few places where lines are broken like so:

+ ExecIncrementalSortInitializeWorker((IncrementalSortState *) planstate,
+ pwcxt);
Or like this:

+ result = (PlanState *) ExecInitIncrementalSort(
+ (IncrementalSort *) node, estate, eflags);

e.g. a param is on the next line, but aligned to the very same place where
it would be w/o the linebreak. Or is this just some sort of artefact
because I viewed the diff with tabspacing = 8?

I'd fix the grammar here:

+ * Incremental sort is specially optimized kind of multikey sort when
+ * input is already presorted by prefix of required keys list.

Like so:

"Incremental sort is a specially optimized kind of multikey sort used when
the input is already presorted by a prefix of the required keys list."

+ * Consider following example. We have input tuples consisting from

"Consider the following example: We have ..."

+ * In incremental sort case we also have to cost to detect sort groups.

"we also have to cost the detection of sort groups."

"+ * It turns out into extra copy and comparison for each tuple."

"This turns out to be one extra copy and comparison per tuple."

+ "Portions Copyright (c) 1996-2017"

Should probably be 2018 now - time flies fast :)

return_value = _readMaterial();
else if (MATCH("SORT", 4))
return_value = _readSort();
+ else if (MATCH("INCREMENTALSORT", 7))
+ return_value = _readIncrementalSort();
else if (MATCH("GROUP", 5))
return_value = _readGroup();

I think the ", 7" here is left-over from when it was named "INCSORT", and
it should be MATCH("INCREMENTALSORT", 15)), shouldn't it?

+ space, fase when it's value for in-memory

typo: "space, false when ..."

+ bool cmp;
+ cmp = cmpSortSkipCols(node, node->sampleSlot, slot);
+
+ if (cmp)

In the above, the variable cmp could be optimized away with:

+ if (cmpSortSkipCols(node, node->sampleSlot, slot))

(not sure if modern compilers won't do this, anway, though)

+typedef struct IncrementalSortState
+{
+ ScanState ss; /* its first field is NodeTag */
+ bool bounded; /* is the result set
bounded? */
+ int64 bound; /* if bounded, how many
tuples are needed */

If I'm not wrong, the layout of the struct will include quite a bit of
padding on 64 bit due to the mixing of bool and int64, maybe it would be
better to sort the fields differently, e.g. pack 4 or 8 bools together?
Not sure if that makes much of a difference, though.

That's all for now :)

Thank you for your work,

Tels

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tels 2018-01-04 23:24:53 Re: Faster inserts with mostly-monotonically increasing values
Previous Message Haribabu Kommi 2018-01-04 23:20:32 Re: [HACKERS] Pluggable storage