Re: Error-safe user functions

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: Andrew Dunstan <andrew(at)dunslane(dot)net>, Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, Joe Conway <mail(at)joeconway(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Corey Huinker <corey(dot)huinker(at)gmail(dot)com>, Nikita Glukhov <n(dot)gluhov(at)postgrespro(dot)ru>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Error-safe user functions
Date: 2022-12-06 00:18:11
Message-ID: 3518534.1670285891@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Andres Freund <andres(at)anarazel(dot)de> writes:
> Why is context a void *?

elog.h can't depend on nodes.h, at least not without some rather
fundamental rethinking of our #include relationships. We could
possibly use the same kind of hack that fmgr.h does:

typedef struct Node *fmNodePtr;

but I'm not sure that's much of an improvement. Note that it'd
*not* be correct to declare it as anything more specific than Node*,
since the fmgr context pointer is Node* and we're not expecting
callers to do their own IsA checks to see what they were passed.

> I don't think we should "accept" !IsA(context, ErrorSaveContext) - that
> seems likely to hide things like use-after-free.

No, see above. Moving the IsA checks out to the callers would
not improve the garbage-pointer risk one bit, it would just
add code bloat.

> I'd put a pg_unreachable() or such after the errfinish() call.

[ shrug... ] Kinda pointless IMO, but OK.

> This seems like a fair bit of duplicated code.

I don't think refactoring to remove the duplication would improve it.

>> + * This is the same as InputFunctionCall, but the caller may also pass a
>> + * previously-initialized ErrorSaveContext node. (We declare that as
>> + * "void *" to avoid including miscnodes.h in fmgr.h.)

> It seems way cleaner to forward declare ErrorSaveContext instead of
> using void *.

Again, it cannot be any more specific than Node*. But you're right
that we could use fmNodePtr here, and that would be at least a little
nicer.

> I wonder if we shouldn't instead make InputFunctionCallSafe() return a
> boolean and return the Datum via a pointer. As callers are otherwise
> going to need to do SAFE_ERROR_OCCURRED(escontext) themselves, I think
> it should also lead to more concise (and slightly more efficient) code.

Hmm, maybe. It would be a bigger change from existing code, but
I don't think very many call sites would be impacted. (But by
the same token, we'd not save much code this way.) Personally
I put more value on keeping similar APIs between InputFunctionCall
and InputFunctionCallSafe, but I won't argue hard if you're insistent.

> Is there a reason not to provide this infrastructure for
> ReceiveFunctionCall() as well?

There's a comment in 0003 about that: I doubt that it makes sense
to have no-error semantics for binary input. That would require
far more trust in the receive functions' ability to detect garbage
input than I think they have in reality. Perhaps more to the
point, even if we ultimately do that I don't want to do it now.
Including the receive functions in the first-pass conversion would
roughly double the amount of work needed per datatype, and we are
already going to be hard put to it to finish what needs to be done
for v16.

> Not that I have a suggestion for a better name, but I don't particularly
> like "Safe" denoting non-erroring input function calls. There's too many
> interpretations of safe - e.g. safe against privilege escalation issues
> or such.

Yeah, I'm not that thrilled with it either --- but it's a reasonably
on-point modifier, and short.

> It doesn't *quite* seem right to set ->isnull in case of an error. Not
> that it has an obvious harm.

Doesn't matter: if the caller pays attention to either the Datum
value or the isnull flag, it's broken.

> Wonder if it's perhaps worth to add VALGRIND_MAKE_MEM_UNDEFINED() calls
> to InputFunctionCallSafe() to more easily detect cases where a caller
> ignores that an error occured.

I do not think there are going to be enough callers of
InputFunctionCallSafe that we need such tactics to validate them.

> I'd put that into the if (es_context->error_occurred) path. Likely the
> window for store-forwarding issues is smaller than
> InputFunctionCallSafe(), but it's trivial to write it differently...

Does not seem better to me, and your argument for it seems like the
worst sort of premature micro-optimization.

> Think it'd be good to have a test for a composite type where one of the
> columns safely errors out and the other doesn't.

I wasn't trying all that hard on the error tests, because I think
0003 is just throwaway code at this point. If we want to seriously
check the input functions' behavior then we need to factorize the
tests so it can be done per-datatype, not in one central place in
the COPY tests. For the core types it could make sense to provide
some function in pg_regress.c that allows access to the non-exception
code path independently of COPY; but I'm not sure how contrib
datatypes could use that.

In any case, I'm unconvinced that testing each error exit both ways is
likely to be a profitable use of test cycles. The far more likely source
of problems with this patch series is going to be that we miss converting
some ereport call that is reachable with bad input. No amount of
testing is going to prove that that didn't happen.

regards, tom lane

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Andres Freund 2022-12-06 00:22:47 Re: Transaction timeout
Previous Message Michael Paquier 2022-12-06 00:16:17 Re: [PATCH] Add native windows on arm64 support