From: | Paul A Jungwirth <pj(at)illuminatedcomputing(dot)com> |
---|---|
To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
Subject: | Re: Inline non-SQL SRFs using SupportRequestSimplify |
Date: | 2025-08-08 21:46:01 |
Message-ID: | CA+renyXxanvX8rnrqTNkZt1r9=rm59MHSG4DDVfsV2aUGe3dUQ@mail.gmail.com |
Views: | Whole Thread | Raw Message | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Mon, Jul 14, 2025 at 2:21 PM Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> I got around to looking at this again. I generally agree with your
> approach to the refactoring in clauses.c, with minor nitpicks:
Thanks for taking another look! Revisions attached.
> * I don't like postponing the early exit for its-not-a-SELECT;
> as coded, this wastes a pretty decent number of cycles transforming
> a querytree that won't be used (not to mention that I'm not sure
> that our usage of check_sql_fn_retval won't fail on a non-SELECT).
> So I think we should keep this bit where it is:
>
> - /*
> - * The single command must be a plain SELECT.
> - */
> - if (!IsA(querytree, Query) ||
> - querytree->commandType != CMD_SELECT)
> - goto fail;
>
> and then in the other path, simply Assert that those two conditions
> hold for anything the support function might try to give back.
Okay.
> * I'm inclined to think that the test for "it must be declared to
> return a set" should stay in inline_sql_set_returning_function.
> In the case of a support-function-supported function, it's okay either
> if the function returns a set or if it is guaranteed to return exactly
> one row (including edge cases such as null function arguments).
> The support function either knows that already or can take the
> responsibility for checking it. But if we do it like this, we
> foreclose the possibility of supporting the latter class of functions.
Makes sense. Done.
> * But on the other hand, I wonder if this bit shouldn't move to
> the outer function:
>
> /*
> * Refuse to inline if the arguments contain any volatile functions or
> * sub-selects. Volatile functions are rejected because inlining may
> * result in the arguments being evaluated multiple times, risking a
> * change in behavior. Sub-selects are rejected partly for implementation
> * reasons (pushing them down another level might change their behavior)
> * and partly because they're likely to be expensive and so multiple
> * evaluation would be bad.
> */
> if (contain_volatile_functions((Node *) fexpr->args) ||
> contain_subplans((Node *) fexpr->args))
> return NULL;
>
> I am not really convinced that any support function could safely
> ignore those restrictions, and I do fear that a lot would omit the
> enforcement and thereby produce wrong queries in such cases. Another
> thing that likely needs to be in the outer wrapper is the check that
> pg_proc_proconfig is empty, because that doesn't seem like a case
> that support functions could skip over either.
Agreed, done.
> * I don't like the memory management. I think creation/destruction
> of the temp context should occur at the outer level, and in particular
> that we want to perform substitute_actual_srf_parameters() while still
> working in the temp context, and copy out only the final form of the
> query tree. This addresses your XXX comment in v2-0002, and also
> saves support functions from having to re-invent that wheel.
Okay, done. I was trying to defer creating a memory context past as
many checks as possible. It's not an expensive thing to do?
> > I didn't love passing a SysCache HeapTuple into another function,
>
> No, that's perfectly common; see for example
> prepare_sql_fn_parse_info. In fact, one thing I don't like in v2-0002
> is that you should pass the pg_proc entry to the support function as a
> HeapTuple not Form_pg_proc. It's possible to get the Form_pg_proc
> pointer from the HeapTuple but not vice versa, while the Form_pg_proc
> does not allow access to varlena fields, which makes it useless for
> many cases. Even your own example function is forced to re-fetch
> the syscache entry because of this.
Okay, thanks for explaining! Done.
> One other comment on v2-0002 is that this bit doesn't look right:
>
> + /* Get filter if present */
> + node = lthird(expr->args);
> + if (!(IsA(node, Const) && ((Const *) node)->constisnull))
> + {
> + appendStringInfo(&sql, " WHERE %s::text = $3", quote_identifier(colname));
> + }
>
> It's not actually doing anything with the "node" value.
This is correct, but I added a comment. The idea is that $3 will get
the value of "node".
> Backing up to a higher level, it seems like we still have no answer
> for how to build a valid support function result besides "construct an
> equivalent SQL query string and feed it through parse analysis and
> rewrite". That seems both restrictive and expensive. In particular
> it begs the question of why the target function couldn't just have
> been written as a SQL function to begin with. So I still have kind
> of a low estimate of this feature's usefulness. Is there a way to
> do better?
Parsing the function body is no more expensive than what we'd do to
execute it separately, right? And by inlining we only do it once. But
if it was too much for someone, perhaps they could keep a cache of
node trees based on the function arguments and/or hash of the SQL
string, not unlike how foreign keys cache query plans. Or they could
build the node tree directly, without parsing. But parsing the
equivalent string is easy and covers most uses. I can even imagine a
way to eventually semi-automate it, e.g. creating a general-purpose
support function that you can attach to (many) PL/pgSQL functions that
end in `EXECUTE format(...)`.
The reason for supporting more than SQL functions is to let you
construct the query dynamically, e.g. with user-supplied table/column
names, or to only include some expensive filters if needed. This would
be great for building functions that implement temporal
outer/semi/antijoin. Another use-case I personally have, which I think
is quite common, is building "parameterized views" for permissions
checks, e.g. visible_sales(user). In that case we may only need to
include certain joins if the user belongs to certain roles (e.g. a
third-party sales rep).
Rebased to 04b7ff3cd3.
Yours,
--
Paul ~{:-)
pj(at)illuminatedcomputing(dot)com
Attachment | Content-Type | Size |
---|---|---|
v3-0001-Move-some-things-outside-of-inline_set_returning_.patch | application/octet-stream | 12.9 KB |
v3-0002-Add-SupportRequestInlineSRF.patch | application/octet-stream | 27.5 KB |
From | Date | Subject | |
---|---|---|---|
Next Message | Masahiko Sawada | 2025-08-08 21:59:01 | Re: Add CHECK_FOR_INTERRUPTS in pg_buffercache_pages while scanning the buffers |
Previous Message | Jacob Champion | 2025-08-08 21:31:49 | Re: [PATCH] OAuth: fix performance bug with stuck multiplexer events |