[bug?] Missed parallel safety checks, and wrong parallel safety

From: "tsunakawa(dot)takay(at)fujitsu(dot)com" <tsunakawa(dot)takay(at)fujitsu(dot)com>
To: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: [bug?] Missed parallel safety checks, and wrong parallel safety
Date: 2021-04-20 08:52:46
Message-ID: TYAPR01MB29900259117BD4F1C36F0D5BFE489@TYAPR01MB2990.jpnprd01.prod.outlook.com
Lists: pgsql-hackers

Hello,

While doing an experiment, I think we found a few existing problems in how the parallel safety of functions is handled. Could I hear your opinions on what we should do? I'd be willing to create and submit a patch to fix them.

The experiment is to add a parallel safety check in FunctionCallInvoke() and run the regression test with force_parallel_mode=regress. The added check errors out with ereport(ERROR) when the about-to-be-called function is parallel unsafe and the process is currently in parallel mode. 6 test cases failed because the following parallel-unsafe functions were called:

dsnowball_init
balkifnull
int44out
text_w_default_out
widget_out

The first function is created in src/backend/snowball/snowball_create.sql for full text search. The remaining functions are created during the regression test run.
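
For illustration only, the check used in the experiment can be modeled outside the backend roughly as below. This is a minimal standalone sketch, not the patch itself: SketchFunction, sketch_in_parallel_mode, sketch_call_invoke and sketch_double are hypothetical names, the boolean return value stands in for ereport(ERROR), and only the one-letter markings mirror the values stored in pg_proc.proparallel.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* One-letter codes matching the values stored in pg_proc.proparallel. */
#define PROPARALLEL_SAFE        's'
#define PROPARALLEL_RESTRICTED  'r'
#define PROPARALLEL_UNSAFE      'u'

/* Hypothetical stand-in for the function metadata known at call time. */
typedef struct SketchFunction
{
    const char *name;           /* function name, used in the message */
    char        proparallel;    /* parallel-safety marking */
    int       (*fn) (int);      /* the function to call */
} SketchFunction;

/* Hypothetical stand-in for the backend's "in parallel mode" state. */
static bool sketch_in_parallel_mode = false;

/*
 * Call f->fn(arg) unless the function is marked parallel unsafe and we are
 * currently in parallel mode.  Returns true and stores the result on
 * success; returns false where the experiment would raise an error.
 */
static bool
sketch_call_invoke(const SketchFunction *f, int arg, int *result)
{
    assert(f != NULL && f->fn != NULL && result != NULL);

    if (sketch_in_parallel_mode && f->proparallel == PROPARALLEL_UNSAFE)
    {
        fprintf(stderr,
                "parallel-unsafe function \"%s\" called in parallel mode\n",
                f->name);
        return false;
    }

    *result = f->fn(arg);
    return true;
}

/* Trivial function body used to exercise the check. */
static int
sketch_double(int x)
{
    return 2 * x;
}
```

In this model, a function marked 'u' is invoked normally while sketch_in_parallel_mode is false and is rejected once the flag is set, which corresponds to the situation in which the regression tests failed.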

The relevant issues follow.

(1)
Judging from their implementations, all of the above functions are actually parallel safe. It seems that their CREATE FUNCTION statements are just missing PARALLEL SAFE specifications, so I think I'll add them. dsnowball_lexize() may also be parallel safe.

(2)
I'm afraid the above phenomenon reveals that postgres overlooks parallel safety checks in some places. Specifically, we noticed the following:

* User-defined aggregate
CREATE AGGREGATE allows specifying the parallel safety of the aggregate itself, and the planner checks it, but the aggregate's support functions are not checked. OTOH, the documentation clearly says:

https://www.postgresql.org/docs/devel/xaggr.html

"Worth noting also is that for an aggregate to be executed in parallel, the aggregate itself must be marked PARALLEL SAFE. The parallel-safety markings on its support functions are not consulted."

https://www.postgresql.org/docs/devel/sql-createaggregate.html

"An aggregate will not be considered for parallelization if it is marked PARALLEL UNSAFE (which is the default!) or PARALLEL RESTRICTED. Note that the parallel-safety markings of the aggregate's support functions are not consulted by the planner, only the marking of the aggregate itself."

Can we check the parallel safety of aggregate support functions during statement execution and error out? Is there any reason not to do so?

* User-defined data type
The input, output, send, receive, and other functions of a UDT are not checked for parallel safety. Is there any good reason not to check them other than the concern about performance?

* Functions for full text search
Should CREATE TEXT SEARCH TEMPLATE ensure that the functions are parallel safe? (Those functions could be changed to parallel unsafe later with ALTER FUNCTION, though.)

(3) Built-in UDFs are not checked for parallel safety
The functions defined in fmgr_builtins[], which are derived from pg_proc.dat, are not checked. Most of them are marked parallel safe, but some are parallel unsafe or restricted.

Besides, changing their parallel safety with ALTER FUNCTION PARALLEL does not affect the selection of the query plan. This is because fmgr_builtins[] does not have a member for parallel safety.

Should we add a member for parallel safety in fmgr_builtins[], and disallow ALTER FUNCTION to change the parallel safety of builtin UDFs?
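
As a rough illustration of that idea, a builtin table carrying a parallel-safety member might look like the standalone sketch below. Everything here is an assumption made for the example: SketchBuiltin, sketch_builtins, sketch_builtin_parallel, the function names and the OIDs are all made up, and the real fmgr_builtins[] entries have a different layout.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified builtin-table entry with a parallel member. */
typedef struct SketchBuiltin
{
    unsigned int foid;          /* function OID (made-up values below) */
    const char  *funcName;      /* C-level function name (made up) */
    char         parallel;      /* proposed member: 's', 'r', or 'u' */
} SketchBuiltin;

/* Made-up entries; a real table would be generated from pg_proc.dat. */
static const SketchBuiltin sketch_builtins[] = {
    {1, "sketch_safe_fn", 's'},
    {2, "sketch_restricted_fn", 'r'},
    {3, "sketch_unsafe_fn", 'u'},
};

/*
 * Return the parallel-safety marking recorded for a builtin, or '\0' if
 * the OID is not in the table.
 */
static char
sketch_builtin_parallel(unsigned int foid)
{
    size_t      n = sizeof(sketch_builtins) / sizeof(sketch_builtins[0]);

    for (size_t i = 0; i < n; i++)
    {
        assert(sketch_builtins[i].funcName != NULL);
        if (sketch_builtins[i].foid == foid)
            return sketch_builtins[i].parallel;
    }
    return '\0';
}
```

With such a member, a call-time check could consult the table directly for builtins. Because the marking would be fixed at build time, this also suggests why ALTER FUNCTION would have to be disallowed from changing it.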

Regards
Takayuki Tsunakawa
