Re: Parallel INSERT (INTO ... SELECT ...)

From: Greg Nancarrow <gregn4422(at)gmail(dot)com>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: Amit Langote <amitlangote09(at)gmail(dot)com>, "Hou, Zhijie" <houzj(dot)fnst(at)cn(dot)fujitsu(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, vignesh C <vignesh21(at)gmail(dot)com>, David Rowley <dgrowleyml(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "Tsunakawa, Takayuki" <tsunakawa(dot)takay(at)fujitsu(dot)com>, "Tang, Haiying" <tanghy(dot)fnst(at)cn(dot)fujitsu(dot)com>
Subject: Re: Parallel INSERT (INTO ... SELECT ...)
Date: 2021-02-24 12:51:26
Message-ID: CAJcOf-cbcKE1Ws2xztod9oGQ5B1w2J5MThBaP294tkHTB=9cMA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, Feb 24, 2021 at 10:38 PM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> >
> > Thanks, I'll try it.
> > I did, however, notice a few concerns with your suggested alternative fix:
> > - It is not restricted to INSERT (as current fix is).
> >
>
> So what? The Select also has a similar problem.
>

Yes, but you're potentially now adding overhead to every
SELECT/UPDATE/DELETE with a subquery, by the added recursive checking
and walking done by the new call to max_parallel_hazard_walker().and
code block looking for a modifying CTE
And anyway I'm not sure it's really right putting in a fix for SELECT
with a modifying CTE, into a patch that adds parallel INSERT
functionality - in any case you'd need to really spell this out in
code comments, as this is at best a temporary fix that would need to
be removed whenever the query rewriter is fixed to set hasModifyingCTE
correctly.

> > - It does not set parse->hasModifyingCTE (as current fix does), so the
> > return value (PlannedStmt) from standard_planner() won't have
> > hasModifyingCTE set correctly in the cases where the rewriter doesn't
> > set it correctly (and I'm not sure what strange side effects ??? that
> > might have).
>
> Here end goal is not to set hasModifyingCTE but do let me know if you
> see any problem or impact.
>

parse->hasModifyingCTE is not just used in the shortcut-test for
parallel-safety, its value is subsequently copied into the PlannedStmt
returned by standard_planner.
It's inconsistent to leave hasModifyingCTE FALSE when by the fix it
has found a modifying CTE.
Even if no existing tests detect an issue with this, PlannedStmt is
left with a bad hasModifyingCTE value in this case, so there is the
potential for something to go wrong.

> > - Although the invocation of max_parallel_hazard_walker() on a RTE
> > subquery will "work" in finally locating your fix's added
> > "CommonTableExpr" parallel-safety disabling block for commandType !=
> > CMD_SELECT, it looks like it potentially results in checking and
> > walking over a lot of other stuff within the subquery not related to
> > CTEs. The current fix does a more specific and efficient search for a
> > modifying CTE.
> >
>
> I find the current fix proposed by you quite ad-hoc and don't think we
> can go that way.
>

At least my current fix is very specific, efficient and clear in its
purpose, and suitably documented, so it is very clear when and how it
is to be removed, when the issue is fixed in the query rewriter.
Another concern with the alternative fix is that it always searches
for a modifying CTE, even when parse->hasModifyingCTE is true after
the query rewriter processing.

Regards,
Greg Nancarrow
Fujitsu Australia

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Amit Langote 2021-02-24 12:59:47 Re: [BUG] segfault during delete
Previous Message Amit Kapila 2021-02-24 12:25:54 Re: Single transaction in the tablesync worker?