Re: [HACKERS] [PATCH] Incremental sort

From: Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>
To: Alexander Korotkov <a(dot)korotkov(at)postgrespro(dot)ru>
Cc: Antonin Houska <ah(at)cybertec(dot)at>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [HACKERS] [PATCH] Incremental sort
Date: 2017-11-19 21:24:00
Message-ID: CAEepm=1SMCcCL_DHMADJKZ9B9FDh6JEe+Vx+LTtdnfzELaPA1A@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, Nov 15, 2017 at 7:42 AM, Alexander Korotkov
<a(dot)korotkov(at)postgrespro(dot)ru> wrote:
> Sure, please find rebased patch attached.

+ /*
+ * Check if first "skipCols" sort values are equal.
+ */
+ static bool
+ cmpSortSkipCols(IncrementalSortState *node, TupleTableSlot *a,
+ TupleTableSlot *b)
+ {
+ int n, i;
+
+ Assert(IsA(node->ss.ps.plan, IncrementalSort));
+
+ n = ((IncrementalSort *) node->ss.ps.plan)->skipCols;
+
+ for (i = 0; i < n; i++)
+ {
+ Datum datumA, datumB, result;
+ bool isnullA, isnullB;
+ AttrNumber attno = node->skipKeys[i].attno;
+ SkipKeyData *key;
+
+ datumA = slot_getattr(a, attno, &isnullA);
+ datumB = slot_getattr(b, attno, &isnullB);
+
+ /* Special case for NULL-vs-NULL, else use standard comparison */
+ if (isnullA || isnullB)
+ {
+ if (isnullA == isnullB)
+ continue;
+ else
+ return false;
+ }
+
+ key = &node->skipKeys[i];
+
+ key->fcinfo.arg[0] = datumA;
+ key->fcinfo.arg[1] = datumB;
+
+ /* just for paranoia's sake, we reset isnull each time */
+ key->fcinfo.isnull = false;
+
+ result = FunctionCallInvoke(&key->fcinfo);
+
+ /* Check for null result, since caller is clearly not expecting one */
+ if (key->fcinfo.isnull)
+ elog(ERROR, "function %u returned NULL", key->flinfo.fn_oid);
+
+ if (!DatumGetBool(result))
+ return false;
+ }
+ return true;
+ }

Is there some reason not to use ApplySortComparator for this? I think
you're missing out on lower-overhead comparators, and in any case it's
probably good code reuse, no?

Embarrassingly, I was unaware of this patch and started prototyping
exactly the same thing independently[1]. I hadn't got very far and
will now abandon that, but that's one thing I did differently. Two
other things that may be different: I had a special case for groups of
size 1 that skipped the sorting, and I only sorted on the suffix
because I didn't put tuples with different prefixes into the sorter (I
was assuming that tuplesort_reset was going to be super efficient,
though I hadn't got around to writing that). I gather that you have
determined empirically that it's better to be able to sort groups of
at least MIN_GROUP_SIZE than to be able to skip the comparisons on the
leading attributes, but why is that the case?

[1] https://github.com/macdice/postgres/commit/ab0f8aff9c4b25d5598aa6b3c630df864302f572

--
Thomas Munro
http://www.enterprisedb.com

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Andrew Dunstan 2017-11-19 21:38:47 Re: [HACKERS] [PATCH] A hook for session start
Previous Message Tom Lane 2017-11-19 21:11:50 Re: [HACKERS] pgbench regression test failure