Re: make tuplestore helper function

From: Melanie Plageman <melanieplageman(at)gmail(dot)com>
To: Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>
Cc: Pg Hackers <pgsql-hackers(at)postgresql(dot)org>, Andres Freund <andres(at)anarazel(dot)de>
Subject: Re: make tuplestore helper function
Date: 2021-11-08 19:44:57
Message-ID: CAAKRu_amqL5Kkdkr0XP99br3AKVS8FFuAuZ3pRw=awT3SN4nHg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

The attached patch contains a helper now named
MakeFuncResultTuplestore() which aims to eliminate a few lines of
redundant code that all FUNCAPI functions making tuplestores repeated.

While investigating all of these call sites and making the changes
suggested by Álvaro, I noticed that the functions making tuplestores
broke out into seven groups. For some of them, using my helper changes
what memory context the tuple descriptor is made in.

Explanation of how/why memory context tuple descriptor is in has
changed:

An earlier version of this patch changed into the per query memory
context inside of the helper function. This version does not do so but
instead asserts that the caller has changed into the appropriate memory
context.

For callers which previously did not call get_call_result_type() and
instead used a function like CreateTupleDescCopy() to create the tuple
descriptor which was used for the tuples in the tuplestore, most of them
made this copy in the per query memory context. For these callers it
seemed error-prone to switch to the per query memory context to create
the copied tuple descriptor and then switch back before calling the
helper function. Instead, the helper function will assert that the
caller has already switched into the per query memory context so that
the tuplestore is made in the per query context.

This means that for the group of functions which, prior to this patch,
called get_call_result_type() in a shorter-lived memory context but now
rely on the helper function to call get_call_result_type(), the tuple
descriptor will now be in the per query memory context.

I think that this makes sense since most of these are set as the
ResultSetInfo->setDesc (the tuple descriptor for returned tuples).

Explanation of the different groups of functions using the
MakeFuncResultTuplestore() helper:

The first group are functions which called get_call_result_type(),
allocating memory for the tuple descriptor in the short-lived memory
context, then switched into the per query memory context before
creating the tuplestore, then switched back to the short-lived memory
context directly after making the tuplestore.

Now that they use the helper function and switch into the per query
memory context before calling it, the tuple descriptor is allocated in
the per query memory context.

The first group:

pg_stop_backup_v2()
pg_event_trigger_dropped_objects()
pg_event_trigger_ddl_commands()
pg_available_extensions()
pg_available_extension_versions()
pg_extension_update_paths()
pg_hba_file_rules()
pg_stat_get_subscription()
pg_logical_slot_get_changes_guts()
pg_show_replication_origin_status()
pg_get_replication_slots()
pg_stat_get_wal_senders()
pg_get_shmem_allocations()
pg_timezone_names()
pg_ls_dir_files()
pg_get_backend_memory_contexts()
pg_stat_get_progress_info()
pg_stat_get_activity()
pg_stat_get_slru()

The second group are functions which, instead of calling
get_call_result_type(), call CreateTemplateTupleDesc() and
TupleDescInitEntry() to allocate and initialize the tuple descriptor in
the per query memory context.
Their memory usage is unchanged by this patch.

The second group:

pg_prepared_statement()
pg_ls_dir()
pg_tablespace_databases()
show_all_file_settings()
pg_cursor()

The third and fourth groups are functions that use CreateTupleDescCopy()
to make a tuple descriptor copy from ResultSetInfo->expectedDesc.
These functions made a copy of ResultSetInfo->expectedDesc but did not
check if it was present before doing so (and do not seem to call a
function which would call internal_get_result_type() -- which does this
check), so I have added that check.
Their memory usage is unchanged by this patch.

The third group:

text_to_table()

The fourth group differs from the third in that the values for the
tuples are actually created in the per query memory context -- seemingly
unnecessarily since tuplestore_putvalues() will eventually end up
copying everything when it is forming tuples. This behavior is not
altered at all by the MakeFuncResultTuplestore() helper function patch,
however I wanted to mention that I noticed that perhaps more was being
allocated in the per query memory context than is necessary.

The fourth group:

pg_options_to_table()
pg_config()

The fifth group is a function that uses CreateTupleDescCoy() to make a
copy of a tuple descriptor (NOT ResultSetInfo->expectedDesc) in the
function info memory context (fcinfo->flinfo->fn_mcxt) so that it can be
accessed later.
Their memory usage is unchanged by this patch.

The fifth group:

populate_recordset_worker()

The sixth group of functions use CreateTupleDescCopy() to make a copy of
the tuple descriptor that they set up using get_call_result_type().

They called get_call_result_type() in a short-lived memory context, then
switched to the per query memory context and made a copy of it right
away. It is unclear why they couldn't just call get_call_result_type()
in the per query context and not make a copy. Perhaps they needed some
specific behavior of the copy (like that it doesn't copy constraints and
defaults?).

With the attached patch, the initial tuple descriptor is now made in the
per query memory context (as well as the copy). I wonder if the initial
tuple descriptor being made in the per query memory context negated the
need for a copy, however I left the copy intact since I wasn't sure.

These functions also checked for the presence of
ResultSetInfo->expectedDesc which is not applicable for all callers of
MakeFuncResultTuplestore(), so I separated it into a separate check.
get_call_result_type() will call internal_get_result_type() which will
check for expectedDesc if the result type is TYPEFUNC_RECORD, so I'm not
sure if the expectedDesc check is required or not in the body of these
functions.

The sixth group:

each_worker()
each_worker_jsonb()

The seventh group of functions use CreateTupleDescCopy() to make a copy of
ResultSetInfo->expectedDesc in the per query memory context.
This patch does not change any of that. The only difference is that now
the check for expectedDesc is separate.
Their memory usage is unchanged by this patch.

The seventh group:

elements_worker_jsonb()
elements_worker()

Note that for all of the functions in which I added or changed the check for
ResultSetInfo->expectedDesc, it is a little bit different because I've changed
the error code.
I've used error code: ERRCODE_E_R_I_E_SRF_PROTOCOL_VIOLATED where they
previously would have been part of an ereport with error code
ERRCODE_FEATURE_NOT_SUPPORTED.
I'm not sure if the new one is correct. In fact, I'm not sure if an error
related to expectedDesc being set should be user-facing at all (looking at how
expectedDesc is used and set, I wasn't sure). Perhaps it should be an elog() or
an assert?

Feedback is addressed inline below.

On Tue, Nov 2, 2021 at 2:17 PM Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org> wrote:
>
> On 2021-Nov-02, Melanie Plageman wrote:
>
> > Attached is a patch to create a helper function which creates a
> > tuplestore to be used by FUNCAPI-callable functions.
>
> Looks good, and given the amount of code being removed, it seems a
> no-brainer that some helper(s) is/are needed.
>
> I think the name is a bit too generic. How about
> MakeFuncResultTuplestore()?

I've done this rename.

> I think the function should not modify rsinfo -- having that as
> (undocumented) side effect is weird. I would suggest to just return the
> tuplestore, and have the caller stash it in rsinfo->setResult and modify
> the other struct members alongside that. It's a couple more lines per
> caller, but the code is clearer that way.

I've changed the helper not to modify rsinfo.

> > There are a few places with very similar code to the new helper
> > (MakeTuplestore()) but which, for example, don't expect the return type
> > to be a row (pg_ls_dir(), pg_tablespace_databases(), text_to_table()). I
> > wasn't sure if it was worth modifying it for their benefit.
>
> Is this just because of the tupdesc? If that's the case, then maybe the
> tupdesc can be optionally passed in by caller, and when it's not, the
> helper does get_call_result_type() and the tupdesc is used as output
> argument.

I've parameterized the helper in this way and now am able to use it for
all FUNCAPI-callable functions making tuplestores except fmgr_sql (for
SQL functions) in functions.c because it didn't really seem worth it
given the way the code is written.

> > All callers of MakeTuplestore() pass work_mem as the third parameter, so
> > I could just omit it and hard-code it in, but I wasn't sure that felt
> > right in a utility/helper function.
>
> No opinion. I would have used the global in the function instead of
> passing it in.

I've changed this to use the global.

> > I inlined deflist_to_tuplestore() in foreign.c since it was billed as a
> > helper function but wasn't used elsewhere and it made it harder to use
> > MakeTuplestore() in this location.
>
> This is a bit strange. Does it work to pass rsinfo->expectedDesc?

This is addressed by changing MakeFuncResultTuplestore() call
get_call_result_type() optionally and having pg_options_to_table() call
it with NULL as tuple descriptor. Now the behavior is the same -- it
makes a copy of expectedDesc in the per query context and sets that as
setDesc in the ReturnSetInfo.

- Melanie

Attachment Content-Type Size
v2-0001-Add-helper-to-make-tuplestore.patch text/x-patch 47.8 KB

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Melanie Plageman 2021-11-08 19:52:28 Re: make tuplestore helper function
Previous Message Mark Dilger 2021-11-08 19:22:31 Re: CREATE ROLE IF NOT EXISTS