Fix erroneous parallel execution when modifying CTE is present in rewritten query

From: Greg Nancarrow <gregn4422(at)gmail(dot)com>
To: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Robert Haas <robertmhaas(at)gmail(dot)com>
Subject: Fix erroneous parallel execution when modifying CTE is present in rewritten query
Date: 2021-08-27 04:55:34
Message-ID: CAJcOf-f68DT=26YAMz_i0+Au3TcLO5oiHY5=fL6Sfuits6r+_w@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi Hackers,

There is a known bug in the query rewriter: if a query that has a modifying
CTE is re-written, the hasModifyingCTE flag is not getting set in the
re-written query. This bug can result in the query being allowed to execute
in parallel-mode, which results in an error.

For more details from a previous discussion about this, and a test case
that illustrates the issue, refer to:
https://postgr.es/m/CAJcOf-fAdj=nDKMsRhQzndm-O13NY4dL6xGcEvdX5Xvbbi0V7g@mail.gmail.com

As a proposal to fix this problem, I've attached a patch which:

1) Copies the associated hasModifyingCTE and hasRecursive flags when the
rewriter combines CTE lists (using Tom Lane's initial patch code seen in
[1]). This flag copying is missing from the current Postgres code.
2) Adds an error case to specifically disallow the case of applying an
INSERT...SELECT rule action to a command with a data-modifying CTE. This is
because in this case, the rewritten query will actually end up having a
data-modifying CTE that is not at the top level (as it is associated with
the SELECT subquery part), which is not actually allowed by Postgres if
that query is entered normally (as it's the parser that contains the
error-check to ensure that the modifying CTE is at the top level, so this
case avoids detection in the rewriter).
3) Modifies the existing test case in with.sql that tests the merging of an
outer CTE with a CTE in a rule action (as currently that rule action is
using INSERT...SELECT).

For the record, a workaround for this issue (at least addressing how
hasModifyingCTE is meant to exclude the query from parallel execution) has
been suggested in the past, but was not well received. It is the following
addition to the max_parallel_hazard_walker() function:

+ /*
+ * ModifyingCTE expressions are treated as parallel-unsafe.
+ *
+ * XXX Normally, if the Query has a modifying CTE, the
hasModifyingCTE
+ * flag is set in the Query tree, and the query will be
regarded as
+ * parallel-usafe. However, in some cases, a re-written query
with a
+ * modifying CTE does not have that flag set, due to a bug in
the query
+ * rewriter. The following else-if is a workaround for this
bug, to detect
+ * a modifying CTE in the query and regard it as
parallel-unsafe. This
+ * comment, and the else-if block immediately below, may be
removed once
+ * the bug in the query rewriter is fixed.
+ */
+ else if (IsA(node, CommonTableExpr))
+ {
+ CommonTableExpr *cte = (CommonTableExpr *)
node;
+ Query *ctequery = castNode(Query,
cte->ctequery);
+
+ if (ctequery->commandType != CMD_SELECT)
+ {
+ context->max_hazard =
PROPARALLEL_UNSAFE;
+ return true;
+ }
+ }
+

Regards,
Greg Nancarrow
Fujitsu Australia

Attachment Content-Type Size
v1-0001-Propagate-CTE-property-flags-when-copying-a-CTE-list.patch application/octet-stream 6.1 KB

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Michael Paquier 2021-08-27 04:57:45 Re: pgstat_send_connstats() introduces unnecessary timestamp and UDP overhead
Previous Message Michael Paquier 2021-08-27 04:41:39 Re: [PATCH] Disable bgworkers during servers start in pg_upgrade