Re: Parallel INSERT (INTO ... SELECT ...)

From: Greg Nancarrow <gregn4422(at)gmail(dot)com>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: "Hou, Zhijie" <houzj(dot)fnst(at)cn(dot)fujitsu(dot)com>, vignesh C <vignesh21(at)gmail(dot)com>, Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>
Subject: Re: Parallel INSERT (INTO ... SELECT ...)
Date: 2021-01-22 07:16:16
Message-ID: CAJcOf-f2xvSAzgUGdst3x4_WvpiAH_rKyjour8cAZ1qHtHkYkA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Fri, Jan 22, 2021 at 4:49 PM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
>
> >
> > Unfortunately, this change results in a single test failure in the
> > "with" tests when "force_parallel_mode=regress" is in effect.
> >
> > I have reproduced the problem, by extracting relevant SQL from those
> > tests, as follows:
> >
> > CREATE TEMP TABLE bug6051 AS
> > select i from generate_series(1,3) as i;
> > SELECT * FROM bug6051;
> > CREATE TEMP TABLE bug6051_2 (i int);
> > CREATE RULE bug6051_ins AS ON INSERT TO bug6051 DO INSTEAD
> > INSERT INTO bug6051_2
> > SELECT NEW.i;
> > WITH t1 AS ( DELETE FROM bug6051 RETURNING * )
> > INSERT INTO bug6051 SELECT * FROM t1;
> > ERROR: cannot delete tuples during a parallel operation
> >
> > Note that prior to the patch, all INSERTs were regarded as
> > PARALLEL_UNSAFE, so this problem obviously didn't occur.
> > I believe this INSERT should be regarded as PARALLEL_UNSAFE, because
> > it contains a modifying CTE.
> > However, for some reason, the INSERT is not regarded as having a
> > modifying CTE, so instead of finding it PARALLEL_UNSAFE, it falls into
> > the parallel-safety-checks and is found to be PARALLEL_RESTRICTED:
> >
> > The relevant code in standard_planner() is:
> >
> > if ((cursorOptions & CURSOR_OPT_PARALLEL_OK) != 0 &&
> > IsUnderPostmaster &&
> > (parse->commandType == CMD_SELECT ||
> > IsModifySupportedInParallelMode(parse->commandType)) &&
> > !parse->hasModifyingCTE &&
> > max_parallel_workers_per_gather > 0 &&
> > !IsParallelWorker())
> > {
> > /* all the cheap tests pass, so scan the query tree */
> > glob->maxParallelHazard = max_parallel_hazard(parse);
> > glob->parallelModeOK = (glob->maxParallelHazard != PROPARALLEL_UNSAFE);
> > }
> > else
> > {
> > /* skip the query tree scan, just assume it's unsafe */
> > glob->maxParallelHazard = PROPARALLEL_UNSAFE;
> > glob->parallelModeOK = false;
> > }
> >
> > When I debugged this (transformWithClause()), the WITH clause was
> > found to contain a modifying CTE and for the INSERT
> > query->hasModifyingCTE was set true.
> > But somehow in the re-writer code, this got lost.
> > Bug?
> > Ideas?
> >
>
> How it behaves when the table in the above test is a non-temp table
> with your patch? If it leads to the same error then we can at least
> conclude that this is a generic problem and nothing specific to temp
> tables.
>

Oh, I don't believe that this has anything to do with TEMP tables -
it's just that when I relaxed the parallel-safety level on TEMP
tables, it exposed the CTE issue in this test case because it just
happens to use a TEMP table.
Having said that, when I changed that test code to not use a TEMP
table, an Assert fired in the planner code and caused the backend to
abort.
It looks like I need to update the following Assert in the planner
code (unchanged by the current patch) in order to test further - but
this Assert only fired because the commandType was CMD_DELETE, which
SHOULD have been excluded by the "hasModifyingCTE" test on the parent
INSERT, which is what I'm saying is strangely NOT getting set.

/*
* Generate partial paths for final_rel, too, if outer query levels might
* be able to make use of them.
*/
if (final_rel->consider_parallel && root->query_level > 1 &&
!limit_needed(parse))
{
Assert(!parse->rowMarks && parse->commandType == CMD_SELECT);
foreach(lc, current_rel->partial_pathlist)
{
Path *partial_path = (Path *) lfirst(lc);

add_partial_path(final_rel, partial_path);
}
}

Once the Assert above is changed to not test the commandType, the same
error occurs as before.

This appears to possibly be some kind of bug in which hasModifyingCTE
is not getting set, at least in this particular INSERT case, but in
the current Postgres code it doesn't matter because all INSERTs are
considered parallel-unsafe, so won't be executed in parallel-mode
anyway.
I notice that if I execute "SELECT * from t1" instead of "INSERT INTO
bug6051 SELECT * from t1", then "hasModifyingCTE" is getting set to
true for the query, and thus is always considered parallel-unsafe.

Regards,
Greg Nancarrow
Fujitsu Australia

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Hou, Zhijie 2021-01-22 07:21:33 RE: Parallel INSERT (INTO ... SELECT ...)
Previous Message Bharath Rupireddy 2021-01-22 06:42:37 Re: Identify missing publications from publisher while create/alter subscription.