Re: [bug?] Missed parallel safety checks, and wrong parallel safety

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: "houzj(dot)fnst(at)fujitsu(dot)com" <houzj(dot)fnst(at)fujitsu(dot)com>, Greg Nancarrow <gregn4422(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "tsunakawa(dot)takay(at)fujitsu(dot)com" <tsunakawa(dot)takay(at)fujitsu(dot)com>, Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: [bug?] Missed parallel safety checks, and wrong parallel safety
Date: 2021-06-08 14:51:18
Message-ID: CA+TgmoaY5GuGVDfGi__7nCwj9PfsN0z+Z_K0EcjJURwysNG5zw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, Jun 7, 2021 at 11:33 PM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> Note the error is raised after applying the patch, without the patch,
> the above won't show any error (error message could be improved here).
> Such cases can lead to unpredictable behavior without a patch because
> we won't be able to detect the execution of parallel-unsafe functions.
> There are similar examples from regression tests. Now, one way to deal
> with similar cases could be that we document them and say we don't
> consider parallel-safety in such cases and the other way is to detect
> such cases and error out. Yet another way could be that we somehow try
> to check these cases as well before enabling parallelism but I thought
> these cases fall in the similar category as aggregate's support
> functions.

I'm not very excited about the idea of checking type input and type
output functions. It's hard to imagine someone wanting to do something
parallel-unsafe in such a function, unless they're just trying to
prove a point. So I don't think checking it would be a good investment
of CPU cycles. If we do anything at all, I'd vote for just documenting
that such functions should be parallel-safe and that their
parallel-safety marks are not checked when they are used as type
input/output functions. Perhaps we ought to document the same thing
with regard to opclass support functions, another place where it's
hard to imagine a realistic use case for doing something
parallel-unsafe.

In the case of aggregates, I see the issues slightly differently. I
don't know that it's super-likely that someone would want to create a
parallel-unsafe aggregate function, but I think there should be a way
to do it, just in case. However, if somebody wants that, they can just
mark the aggregate itself unsafe. There's no benefit for the user to
marking the aggregate safe and the support functions unsafe and hoping
that the system figures it out somehow.

In my opinion, you're basically taking too pure a view of this. We're
not trying to create a system that does such a good job checking
parallel safety markings that nobody can possibly find a thing that
isn't checked no matter how hard they poke around the dark corners of
the system. Or at least we shouldn't be trying to do that. We should
be trying to create a system that works well in practice, and gives
people the flexibility to easily avoid parallelism when they have a
query that is parallel-unsafe, while still getting the benefit of
parallelism the rest of the time.

I don't know what all the cases you've uncovered are, and maybe
there's something in there that I'd be more excited about changing if
I knew what it was, but the particular problems you're mentioning here
seem more theoretical than real to me.

--
Robert Haas
EDB: http://www.enterprisedb.com

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Emre Hasegeli 2021-06-08 14:53:02 Re: GiST operator class for bool
Previous Message Alvaro Herrera 2021-06-08 14:39:24 Re: Move pg_attribute.attcompression to earlier in struct for reduced size?