Re: [bug?] Missed parallel safety checks, and wrong parallel safety

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: Greg Nancarrow <gregn4422(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "tsunakawa(dot)takay(at)fujitsu(dot)com" <tsunakawa(dot)takay(at)fujitsu(dot)com>, Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: [bug?] Missed parallel safety checks, and wrong parallel safety
Date: 2021-05-06 11:05:46
Message-ID: CA+Tgmobe8Wrc3X51bP0Onr=U08PM13Via5M9jF2M50sfvjN4Hw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, May 6, 2021 at 3:00 AM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> The idea here is to check for parallel safety of functions at
> someplace in the code during function invocation so that if we execute
> any parallel unsafe/restricted function via parallel worker then we
> error out. I think that is a good safety net especially if we can do
> it with some simple check. Now, we already have pg_proc information in
> fmgr_info_cxt_security for non-built-in functions, so we can check
> that and error out if the unsafe function is invoked in parallel mode.
> It has been observed that we were calling some unsafe functions in
> parallel-mode in the regression tests which is caught by such a check.

I see your point, but I am not convinced. As I said to Tsunakawa-san,
doing the check here seems expensive. Also, I had the idea in mind
that parallel-safety should work like volatility. We don't check at
runtime whether a volatile function is being called in a context where
volatile functions are not supposed to be used. If for example you try
to call a volatile function directly from an index expression I
believe you will get an error. But if the index expression calls an
immutable function and then that function internally calls something
volatile, you don't get an error. Now it might not be a good idea: you
could end up with a broken index. But that's your fault for
mislabeling the function you used.

Sometimes this is actually quite useful. You might know that, while
the function is in general volatile, it is immutable in the particular
way that you are using it. Or, perhaps, you are using the volatile
function incidentally and it doesn't affect the output of your
function at all. Or, maybe you actually want to build an index that
might break, and then it's up to you to rebuild the index if and when
that is required. Users do this kind of thing all the time, I think,
and would be unhappy if we started checking it more rigorously than we
do today.

Now, I don't see why the same idea can't or shouldn't apply to
parallel-safety. If you call a parallel-unsafe function in a parallel
context, it's pretty likely that you are going to get an error, and so
you might not want to do it. If the function is written in C, it could
even cause horrible things to happen so that you crash the whole
backend or something, but I tried to set things up so that for
built-in functions you'll just get an error. But on the other hand,
maybe the parallel-unsafe function you are calling is not
parallel-unsafe in all cases. If you want to create a wrapper function
that is labelled parallel-safe and try to make that it only calls the
parallel-unsafe function in the cases where there's no safety problem,
that's up to you!

It's possible that I had the wrong idea here, so maybe the question
deserves more thought, but I wanted to explain what my thought process
was.

--
Robert Haas
EDB: http://www.enterprisedb.com

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Matthias van de Meent 2021-05-06 11:10:30 Re: MaxOffsetNumber for Table AMs
Previous Message Robert Haas 2021-05-06 10:46:56 Re: [bug?] Missed parallel safety checks, and wrong parallel safety