Re: [bug?] Missed parallel safety checks, and wrong parallel safety

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Greg Nancarrow <gregn4422(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "tsunakawa(dot)takay(at)fujitsu(dot)com" <tsunakawa(dot)takay(at)fujitsu(dot)com>, Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: [bug?] Missed parallel safety checks, and wrong parallel safety
Date: 2021-05-06 13:30:33
Message-ID: CAA4eK1Keqx5RMJ0-5PnNEy3b9yj5eEzMtCmR_OCVpRcjaxAMZQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, May 6, 2021 at 4:35 PM Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
>
> On Thu, May 6, 2021 at 3:00 AM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> > The idea here is to check for parallel safety of functions at
> > someplace in the code during function invocation so that if we execute
> > any parallel unsafe/restricted function via parallel worker then we
> > error out. I think that is a good safety net especially if we can do
> > it with some simple check. Now, we already have pg_proc information in
> > fmgr_info_cxt_security for non-built-in functions, so we can check
> > that and error out if the unsafe function is invoked in parallel mode.
> > It has been observed that we were calling some unsafe functions in
> > parallel-mode in the regression tests which is caught by such a check.
>
> I see your point, but I am not convinced. As I said to Tsunakawa-san,
> doing the check here seems expensive.
>

If I read your email correctly then you are saying it is expensive
based on the idea that we need to perform extra syscache lookup but
actually for non-built-in functions, we already have parallel-safety
information so such a check should not incur a significant cost.

> Also, I had the idea in mind
> that parallel-safety should work like volatility. We don't check at
> runtime whether a volatile function is being called in a context where
> volatile functions are not supposed to be used. If for example you try
> to call a volatile function directly from an index expression I
> believe you will get an error. But if the index expression calls an
> immutable function and then that function internally calls something
> volatile, you don't get an error. Now it might not be a good idea: you
> could end up with a broken index. But that's your fault for
> mislabeling the function you used.
>
> Sometimes this is actually quite useful. You might know that, while
> the function is in general volatile, it is immutable in the particular
> way that you are using it. Or, perhaps, you are using the volatile
> function incidentally and it doesn't affect the output of your
> function at all. Or, maybe you actually want to build an index that
> might break, and then it's up to you to rebuild the index if and when
> that is required. Users do this kind of thing all the time, I think,
> and would be unhappy if we started checking it more rigorously than we
> do today.
>
> Now, I don't see why the same idea can't or shouldn't apply to
> parallel-safety. If you call a parallel-unsafe function in a parallel
> context, it's pretty likely that you are going to get an error, and so
> you might not want to do it. If the function is written in C, it could
> even cause horrible things to happen so that you crash the whole
> backend or something, but I tried to set things up so that for
> built-in functions you'll just get an error. But on the other hand,
> maybe the parallel-unsafe function you are calling is not
> parallel-unsafe in all cases. If you want to create a wrapper function
> that is labelled parallel-safe and try to make that it only calls the
> parallel-unsafe function in the cases where there's no safety problem,
> that's up to you!
>

I think it is difficult to say for what purpose parallel-unsafe
function got called in parallel context so if we give an error in
cases where otherwise it could lead to a crash or caused other
horrible things, users will probably appreciate us. OTOH, if the
parallel-safety labeling is wrong (parallel-safe function is marked
parallel-unsafe) and we gave an error in such a case, the user can
always change the parallel-safety attribute by using Alter Function.
Now, if adding such a check is costly or needs some major re-design
then probably it might not be worth whereas I don't think that is the
case for non-built-in function invocation.

--
With Regards,
Amit Kapila.

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Dilip Kumar 2021-05-06 13:38:21 Re: [BUG]"FailedAssertion" reported in lazy_scan_heap() when running logical replication
Previous Message Masahiko Sawada 2021-05-06 12:40:18 Re: [BUG]"FailedAssertion" reported in lazy_scan_heap() when running logical replication