From: | Andres Freund <andres(at)anarazel(dot)de> |
---|---|
To: | Maxim Orlov <orlovmg(at)gmail(dot)com> |
Cc: | Pavel Borisov <pashkin(dot)elfe(at)gmail(dot)com>, Alexander Korotkov <aekorotkov(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: Custom tuplesorts for extensions |
Date: | 2022-07-06 15:01:39 |
Message-ID: | 20220706150139.sidv4zcu7ipcklth@awork3.anarazel.de |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
Hi,
I think this needs to be evaluated for performance...
I agree with the nearby comment that the commits need a bit of justification
at least to review them.
On 2022-06-23 15:12:27 +0300, Maxim Orlov wrote:
> From 03b78cdade3b86a0e97723721fa1d0bd64d0c7df Mon Sep 17 00:00:00 2001
> From: Alexander Korotkov <akorotkov(at)postgresql(dot)org>
> Date: Tue, 21 Jun 2022 13:28:27 +0300
> Subject: [PATCH v2 1/6] Remove Tuplesortstate.copytup
Yea. I was recently complaining about the pointlessness of copytup.
> From 1d78e271b22d7c6a1557defbe15ea5039ff28510 Mon Sep 17 00:00:00 2001
> From: Alexander Korotkov <akorotkov(at)postgresql(dot)org>
> Date: Tue, 21 Jun 2022 14:03:13 +0300
> Subject: [PATCH v2 2/6] Tuplesortstate.getdatum1 method
Hm. This adds a bunch of indirect function calls were there previously
weren't.
> From 494d46dcf938e5f824a498e38ce77782751208e1 Mon Sep 17 00:00:00 2001
> From: Alexander Korotkov <akorotkov(at)postgresql(dot)org>
> Date: Tue, 21 Jun 2022 14:13:56 +0300
> Subject: [PATCH v2 3/6] Put abbreviation logic into puttuple_common()
There's definitely a lot of redundancy removed... But the list of branches in
puttuple_common() grew. Perhaps we instead can add a few flags to
putttuple_common() that determine whether abbreviation should happen, so that
we only do the work necessary for the "type" of sort?
> From ee2dd46b07d62e13ed66b5a38272fb5667c943f3 Mon Sep 17 00:00:00 2001
> From: Alexander Korotkov <akorotkov(at)postgresql(dot)org>
> Date: Wed, 22 Jun 2022 00:14:51 +0300
> Subject: [PATCH v2 4/6] Move freeing memory away from writetup()
Seems to do more than just moving freeing around?
> @@ -1973,8 +1963,13 @@ tuplesort_putdatum(Tuplesortstate *state, Datum val, bool isNull)
> static void
> puttuple_common(Tuplesortstate *state, SortTuple *tuple)
> {
> + MemoryContext oldcontext = MemoryContextSwitchTo(state->sortcontext);
> +
> Assert(!LEADER(state));
>
> + if (tuple->tuple != NULL)
> + USEMEM(state, GetMemoryChunkSpace(tuple->tuple));
> +
Adding even more branches into common code...
> From 3a0e1fa7c7e4da46a86f7d5b9dd0392549f3b460 Mon Sep 17 00:00:00 2001
> From: Alexander Korotkov <akorotkov(at)postgresql(dot)org>
> Date: Wed, 22 Jun 2022 18:11:26 +0300
> Subject: [PATCH v2 5/6] Reorganize data structures
Hard to know what this is trying to achieve.
> -struct Tuplesortstate
> +struct TuplesortOps
> {
> - TupSortStatus status; /* enumerated value as shown above */
> - int nKeys; /* number of columns in sort key */
> - int sortopt; /* Bitmask of flags used to setup sort */
> - bool bounded; /* did caller specify a maximum number of
> - * tuples to return? */
> - bool boundUsed; /* true if we made use of a bounded heap */
> - int bound; /* if bounded, the maximum number of tuples */
> - bool tuples; /* Can SortTuple.tuple ever be set? */
> - int64 availMem; /* remaining memory available, in bytes */
> - int64 allowedMem; /* total memory allowed, in bytes */
> - int maxTapes; /* max number of input tapes to merge in each
> - * pass */
> - int64 maxSpace; /* maximum amount of space occupied among sort
> - * of groups, either in-memory or on-disk */
> - bool isMaxSpaceDisk; /* true when maxSpace is value for on-disk
> - * space, false when it's value for in-memory
> - * space */
> - TupSortStatus maxSpaceStatus; /* sort status when maxSpace was reached */
> MemoryContext maincontext; /* memory context for tuple sort metadata that
> * persists across multiple batches */
> MemoryContext sortcontext; /* memory context holding most sort data */
> MemoryContext tuplecontext; /* sub-context of sortcontext for tuple data */
> - LogicalTapeSet *tapeset; /* logtape.c object for tapes in a temp file */
>
> /*
> * These function pointers decouple the routines that must know what kind
To me it seems odd to have memory contexts and similar things in a
datastructure calls *Ops.
> From b06bcb5f3666f0541dfcc27c9c8462af2b5ec9e0 Mon Sep 17 00:00:00 2001
> From: Alexander Korotkov <akorotkov(at)postgresql(dot)org>
> Date: Wed, 22 Jun 2022 21:48:05 +0300
> Subject: [PATCH v2 6/6] Split tuplesortops.c
I strongly suspect this will cause a slowdown. There was potential for
cross-function optimization that's now removed.
Greetings,
Andres Freund
From | Date | Subject | |
---|---|---|---|
Next Message | Matthias van de Meent | 2022-07-06 15:28:55 | Re: [RFC] Improving multi-column filter cardinality estimation using MCVs and HyperLogLog |
Previous Message | Tom Lane | 2022-07-06 14:48:21 | Re: Use outerPlanState macro instead of referring to leffttree |