Re: Error-safe user functions

From: Corey Huinker <corey(dot)huinker(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Andrew Dunstan <andrew(at)dunslane(dot)net>, Robert Haas <robertmhaas(at)gmail(dot)com>, Nikita Glukhov <n(dot)gluhov(at)postgrespro(dot)ru>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Error-safe user functions
Date: 2022-12-02 19:06:09
Message-ID: CADkLM=cvLMbPSOCsiSATuwbDx3mOn1tH3uifm+fZ=ro_xOA=Lw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Fri, Dec 2, 2022 at 1:46 PM Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:

> Corey Huinker <corey(dot)huinker(at)gmail(dot)com> writes:
> > I'm still working on organizing my patch, but it grew out of a desire to
> do
> > this:
> > CAST(value AS TypeName DEFAULT expr)
> > This is a thing that exists in other forms in other databases and while
> it
> > may look unrelated, it is essentially the SQL/JSON casts within a nested
> > data structure issue, just a lot simpler.
>
> Okay, maybe that's why I was thinking we had a requirement for
> failure-free casts. Sure, you can transform it to the other thing
> by always implementing this as a cast-via-IO, but you could run into
> semantics issues that way. (If memory serves, we already have cases
> where casting X to Y gives a different result from casting X to text
> to Y.)
>

Yes, I was setting aside the issue of direct cast functions for my v0.1

>
> > My original plan had been two new params to all _in() functions: a
> boolean
> > error_mode and a default expression Datum.
> > After consulting with Jeff Davis and Michael Paquier, the notion of
> > modifying fcinfo itself two booleans:
> > allow_error (this call is allowed to return if there was an error with
> > INPUT) and
> > has_error (this function has the concept of a purely-input-based error,
> > and found one)
>
> Hmm ... my main complaint about that is the lack of any way to report
> the details of the error. I realize that a plain boolean failure flag
> might be enough for our immediate use-cases, but I don't want to expend
> the amount of effort this is going to involve and then later find we
> have a use-case where we need the error details.
>

I agree, but then we're past a boolean for allow_error, and we probably get
into a list of modes like this:

CAST_ERROR_ERROR /* default ereport(), what we do now */
CAST_ERROR_REPORT_FULL /* report that the cast failed, everything that you
would have put in the ereport() instead put in a struct that gets returned
to caller */
CAST_ERROR_REPORT_SILENT /* report that the cast failed, but nobody cares
why so don't even form the ereport strings, good for bulk operations */
CAST_ERROR_WARNING /* report that the cast failed, but emit ereport() as a
warning */
CAST_ERROR_[NOTICE,LOG,DEBUG1,..DEBUG5] /* same, but some other loglevel */

>
> The sketch that's in my head at the moment is to make use of the existing
> "context" field of FunctionCallInfo: if that contains a node of some
> to-be-named type, then we are requesting that errors not be thrown
> but instead be reported by passing back an ErrorData using a field of
> that node. The issue about not constructing an ErrorData if the outer
> caller doesn't need it could perhaps be addressed by adding some boolean
> flag fields in that node, but the details of that need not be known to
> the functions reporting errors this way; it'd be a side channel from the
> outer caller to elog.c.
>

That should be a good place for it, assuming it's not already used like
fn_extra is. It would also squash those cases above into just three: ERROR,
REPORT_FULL, and REPORT_SILENT, leaving it up to the node what type of
erroring/logging is appropriate.

>
> The main objection I can see to this approach is that we only support
> one context value per call, so you could not easily combine this
> functionality with existing use-cases for the context field. A quick
> census of InitFunctionCallInfoData calls finds aggregates, window
> functions, triggers, and procedures, none of which seem like plausible
> candidates for wanting no-error behavior, so I'm not too concerned
> about that. (Maybe the error-reporting node could be made a sub-node
> of the context node in any future cases where we do need it?)
>

A subnode had occurred to me when fiddling about with fn_extra, so so that
applies here, but if we're doing a sub-node, then maybe it's worth it's own
parameter. I struggled with that because of how few of these functions will
use it vs how often they're executed.

>
> > The one gap I see so far in the patch presented is that it returns a null
> > value on bad input, which might be ok if the node has the default, but
> that
> > then presents the node with having to understand whether it was a null
> > because of bad input vs a null that was expected.
>
> Yeah. That's something we could probably get away with for the case of
> input functions only, but I think explicit out-of-band signaling that
> there was an error is a more future-proof solution.
>

I think we'll run into it fairly soon, because if I recall correctly,
SQL/JSON has a formatting spec that essentially means that we're not
calling input functions, we're calling TO_CHAR() and TO_DATE(), but they're
very similiar to input functions.

One positive side effect of all this is we can get a isa(value, typname)
construct like this "for free", we just try the cast and return the value.

I'm still working on my patch even though it will probably be sidelined in
the hopes that it informs us of any subsequent issues.

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Nathan Bossart 2022-12-02 19:15:07 Re: O(n) tasks cause lengthy startups and checkpoints
Previous Message Tom Lane 2022-12-02 18:46:01 Re: Error-safe user functions