Re: tuplesort_gettuple_common() and *should_free argument

From: Andres Freund <andres(at)anarazel(dot)de>
To: Peter Geoghegan <pg(at)bowt(dot)ie>
Cc: Peter Geoghegan <pg(at)heroku(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Robert Haas <robertmhaas(at)gmail(dot)com>, hlinnaka <hlinnaka(at)iki(dot)fi>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: tuplesort_gettuple_common() and *should_free argument
Date: 2017-04-04 20:32:02
Message-ID: 20170404203202.m6rgxsegnxq54bgh@alap3.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 2017-03-13 18:14:07 -0700, Peter Geoghegan wrote:
> From 5351b5db257cb39832647d9117465c0217e6268b Mon Sep 17 00:00:00 2001
> From: Peter Geoghegan <pg(at)bowt(dot)ie>
> Date: Thu, 13 Oct 2016 10:54:31 -0700
> Subject: [PATCH 1/2] Avoid copying within tuplesort_gettupleslot().

s/Avoid/Allow to avoid/

> Add a "copy" argument to make it optional to receive a copy of caller
> tuple that is safe to use following a subsequent manipulating of
> tuplesort's state. This is a performance optimization. Most existing
> tuplesort_gettupleslot() callers are made to opt out of copying.
> Existing callers that happen to rely on the validity of tuple memory
> beyond subsequent manipulations of the tuplesort request their own copy.
>
> This brings tuplesort_gettupleslot() in line with
> tuplestore_gettupleslot(). In the future, a "copy" tuplesort_getdatum()
> argument may be added, that similarly allows callers to opt out of
> receiving their own copy of tuple.
>
> In passing, clarify assumptions that callers of other tuplesort fetch
> routines may make about tuple memory validity, per gripe from Tom Lane.
> ---
> src/backend/executor/nodeAgg.c | 10 +++++++---
> src/backend/executor/nodeSort.c | 5 +++--
> src/backend/utils/adt/orderedsetaggs.c | 5 +++--
> src/backend/utils/sort/tuplesort.c | 28 +++++++++++++++++-----------
> src/include/utils/tuplesort.h | 2 +-
> 5 files changed, 31 insertions(+), 19 deletions(-)
>
> diff --git a/src/backend/executor/nodeAgg.c b/src/backend/executor/nodeAgg.c
> index aa08152..b650f71 100644
> --- a/src/backend/executor/nodeAgg.c
> +++ b/src/backend/executor/nodeAgg.c
> @@ -570,6 +570,10 @@ initialize_phase(AggState *aggstate, int newphase)
> * Fetch a tuple from either the outer plan (for phase 0) or from the sorter
> * populated by the previous phase. Copy it to the sorter for the next phase
> * if any.
> + *
> + * Callers cannot rely on memory for tuple in returned slot remaining valid
> + * past any subsequent manipulation of the sorter, such as another fetch of
> + * input from sorter. (The sorter may recycle memory.)
> */

It's not just the sorter, right? I'd just rephrase that the caller
cannot rely on tuple to stay valid beyond a single call.

> static bool
> tuplesort_gettuple_common(Tuplesortstate *state, bool forward,
> @@ -2091,12 +2092,15 @@ tuplesort_gettuple_common(Tuplesortstate *state, bool forward,
> * NULL value in leading attribute will set abbreviated value to zeroed
> * representation, which caller may rely on in abbreviated inequality check.
> *
> - * The slot receives a copied tuple (sometimes allocated in caller memory
> - * context) that will stay valid regardless of future manipulations of the
> - * tuplesort's state.
> + * If copy is TRUE, the slot receives a copied tuple that will stay valid
> + * regardless of future manipulations of the tuplesort's state. Memory is
> + * owned by the caller. If copy is FALSE, the slot will just receive a
> + * pointer to a tuple held within the tuplesort, which is more efficient,
> + * but only safe for callers that are prepared to have any subsequent
> + * manipulation of the tuplesort's state invalidate slot contents.
> */

Hm. Do we need to note anything about how long caller memory needs to
live? I think we'd get into trouble if the caller does a memory context
reset, without also clearing the slot?

> bool
> -tuplesort_gettupleslot(Tuplesortstate *state, bool forward,
> +tuplesort_gettupleslot(Tuplesortstate *state, bool forward, bool copy,
> TupleTableSlot *slot, Datum *abbrev)
> {
> MemoryContext oldcontext = MemoryContextSwitchTo(state->sortcontext);
> @@ -2113,8 +2117,10 @@ tuplesort_gettupleslot(Tuplesortstate *state, bool forward,
> if (state->sortKeys->abbrev_converter && abbrev)
> *abbrev = stup.datum1;
>
> - stup.tuple = heap_copy_minimal_tuple((MinimalTuple) stup.tuple);
> - ExecStoreMinimalTuple((MinimalTuple) stup.tuple, slot, true);
> + if (copy)
> + stup.tuple = heap_copy_minimal_tuple((MinimalTuple) stup.tuple);
> +
> + ExecStoreMinimalTuple((MinimalTuple) stup.tuple, slot, copy);
> return true;

Other than these minimal adjustments, this looks good to go to me.

- Andres

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tomas Vondra 2017-04-04 20:42:39 bug in SlabAlloc / VALGRIND_MAKE_MEM_DEFINED
Previous Message Andres Freund 2017-04-04 20:19:15 Re: Logical decoding on standby