Re: New "single-call SRF" APIs are very confusingly named

From: Melanie Plageman <melanieplageman(at)gmail(dot)com>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: pgsql-hackers(at)postgresql(dot)org, Michael Paquier <michael(dot)paquier(at)gmail(dot)com>
Subject: Re: New "single-call SRF" APIs are very confusingly named
Date: 2022-10-14 21:09:46
Message-ID: CAAKRu_adkse5PMvUb_B=mgiTDyuJGbb2=O32xFUV9ndLSA1vPg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, Oct 13, 2022 at 3:48 PM Andres Freund <andres(at)anarazel(dot)de> wrote:
> I unfortunately just noticed this now, just after we released...
>
> In
>
> commit 9e98583898c347e007958c8a09911be2ea4acfb9
> Author: Michael Paquier <michael(at)paquier(dot)xyz>
> Date: 2022-03-07 10:26:29 +0900
>
> Create routine able to set single-call SRFs for Materialize mode
>
>
> a new helper was added:
>
> #define SRF_SINGLE_USE_EXPECTED 0x01 /* use expectedDesc as tupdesc */
> #define SRF_SINGLE_BLESS 0x02 /* validate tuple for SRF
*/
> extern void SetSingleFuncCall(FunctionCallInfo fcinfo, bits32 flags);
>
>
> I think the naming here is very poor. For one, "Single" here conflicts
with
> ExprSingleResult which indicates "expression does not return a set",
> i.e. precisely the opposite what SetSingleFuncCall() is used for. For
another
> the "Set" in SetSingleFuncCall makes it sound like it's function setting a
> property.
>
> Even leaving the confusion with ExprSingleResult aside, calling it
"Single"
> still seems very non-descriptive. I assume it's named to contrast with
> init_MultiFuncCall() etc. While those are also not named greatly, they're
not
> typically used directly but wraped in SRF_FIRSTCALL_INIT etc.

So, while I agree that the "Single" in SetSingleFuncCall() could be
confusing given the name of ExprSingleResult, I feel like actually all
of the names are somewhat wrong.

ExprSingleResult, ExprMultipleResult, and ExprEndResult are used as
values of ReturnSetInfo->isDone, used in value-per-call mode to indicate
whether or not a given value is the last or not. The comment on
ExprSingleResult says it means "expression does not return a set",
however, in Materialize mode (which is for functions returning a set),
isDone is supposed to be set to ExprSingleResult.

Take this code in ExecMakeFunctionResultSet()

else if (rsinfo.returnMode == SFRM_Materialize)
{
/* check we're on the same page as the function author */
if (rsinfo.isDone != ExprSingleResult)

So, isDone is used for a different purpose in value-per-call and
materialize modes (and with pretty contradictory definitions) which is
pretty confusing.

Besides that, it is not clear to me that ExprMultipleResult conveys that
the result is a member of or an element of a set. Perhaps it should be
ExprSetMemberResult and instead of using ExprSingleResult for
Materialize mode there should be another enum value that indicates "not
used" or "materialize mode". It could even be ExprSetResult -- since the
whole result is a set. Though that may be confusing since isDone is not
used for Materialize mode except to ensure "we're on the same page as
the function author".

Expr[Single|Multiple]Result aside, I do see how SINGLE/Single when used
for a helper function that does set up for SFRM_Materialize mode
functions is confusing.

The routines for SFRM_ValuePerCall all use multi, so I don't think it
was unreasonable to use single. However, I agree it would be better to
use something else (probably materialize).

The different dimensions requiring distinction are:
- returns a set (Y/N)
- called multiple times to produce a single result (Y/N)
- builds a tuplestore for result set (Y/N)

SFRM_Materialize comment says "result set instantiated in Tuplestore" --
So, I feel like the question is, does a function which returns its
entire result set in a single invocation have to do so using a
tuplestore and does one that returns part of its result set on each
invocation have to do so without a tuplestore (does each invocation have
to return a scalar or tuple)?

The current implementation may not support it, but it doesn't seem like
using a tuplestore and returning all elements of the result set vs some
of them in one invocation are alternatives.

It might be better if the SetFunctionReturnMode stuck to distinguishing
between functions returning their entire result in one invocation or
part of their result in one invocation.

That being said, the current SetSingleFuncCall() makes the tuplestore
and ensures the TupleDesc required by Materialize mode is set or
created. Since it seems only to apply to Materialize mode, I am in favor
of using "materialize" instead of "single".

> Maybe something like InitMaterializedSRF() w/
> MAT_SRF_(USE_EXPECTED_DESC|BLESS)

I also agree that starting the function name with Set isn't the best. I
like InitMaterializedSRF() and MAT_SRF_USE_EXPECTED_TUPLE_DESC. Are there
other kinds of descs?

Also, "single call" and "multi call" are confusing because they kind of
seem like they are describing a behavior of the function limiting the
number of times it can be called. Perhaps the multi* function names
could eventually be renamed something to convey how much of a function's
result can be expected to be produced on an invocation.

To summarize, I am in support of renaming SetSingleFuncCall() ->
InitMaterializedSRF() and SRF_SINGLE_USE_EXPECTED ->
MAT_SRF_USE_EXPECTED_TUPLE_DESC (or just DESC) as suggested elsewhere in
this thread. And I think we should eventually consider renaming the
multi* function names and consider if ExprSingleResult is a good name.

- Melanie

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2022-10-14 21:32:50 Re: New "single-call SRF" APIs are very confusingly named
Previous Message Imseih (AWS), Sami 2022-10-14 20:05:41 Re: Add index scan progress to pg_stat_progress_vacuum