Re: Parallel INSERT (INTO ... SELECT ...)

From: Greg Nancarrow <gregn4422(at)gmail(dot)com>
To: Dilip Kumar <dilipbalaut(at)gmail(dot)com>
Cc: Amit Langote <amitlangote09(at)gmail(dot)com>, "Hou, Zhijie" <houzj(dot)fnst(at)cn(dot)fujitsu(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, vignesh C <vignesh21(at)gmail(dot)com>, David Rowley <dgrowleyml(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "Tsunakawa, Takayuki" <tsunakawa(dot)takay(at)fujitsu(dot)com>, "Tang, Haiying" <tanghy(dot)fnst(at)cn(dot)fujitsu(dot)com>
Subject: Re: Parallel INSERT (INTO ... SELECT ...)
Date: 2021-03-04 01:45:55
Message-ID: CAJcOf-diJ-PAXMS3KkiX2k_r-+GzXyKaJMkEV7fDTjUYEwHCZg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, Mar 4, 2021 at 1:07 AM Dilip Kumar <dilipbalaut(at)gmail(dot)com> wrote:
>
> On Wed, Mar 3, 2021 at 5:50 PM Greg Nancarrow <gregn4422(at)gmail(dot)com> wrote:
> >
> > Asserts are normally only enabled in a debug-build, so for a
> > release-build that Assert has no effect.
> > The Assert is being used as a sanity-check that the function is only
> > currently getting called for INSERT, because that's all it currently
> > supports.
>
> I agree that assert is only for debug build, but once we add and
> assert that means we are sure that it should only be called for insert
> and if it is called for anything else then it is a programming error
> from the caller's side. So after the assert, adding if check for the
> same condition doesn't look like a good idea. That means we think
> that the code can hit assert in the debug mode so we need an extra
> protection in the release mode.
>

The if-check isn't there for "extra protection".
It's to help with future changes; inside that if-block is code only
applicable to INSERT (and to UPDATE - sorry, before I said DELETE), as
the code-comment indicates, whereas the rest of the function is
generic to all command types. I don't see any problem with having this
if-block here, to help in this way, when other command types are
added.

>
> >
> > > 2.
> > > In patch 0004, We are still charging the parallel_tuple_cost for each
> > > tuple, are we planning to do something about this? I mean after this
> > > patch tuple will not be transferred through the tuple queue, so we
> > > should not add that cost.
> > >
> >
> > I believe that for Parallel INSERT, cost_modifytable() will set
> > path->path.rows to 0 (unless there is a RETURNING list), so, for
> > example, in cost_gather(), it will not add to the run_cost as
> > "run_cost += parallel_tuple_cost * path->path.rows;"
> >
>
> But the cost_modifytable is setting the number of rows to 0 in
> ModifyTablePath whereas the cost_gather will multiply the rows from
> the GatherPath. I can not see the rows from GatherPath is ever set to
> 0.
>

OK, I see the problem now.
It works the way I described, but currently there's a problem with
where it's getting the rows for the GatherPath, so there's a
disconnect.
When generating the GatherPaths, it's currently always taking the
rel's (possibly estimated) row-count, rather than using the rows from
the cheapest_partial_path (the subpath: ModifyTablePath). See
generate_gather_paths().
So when generate_useful_gather_paths() is called from the planner, for
the added partial paths for Parallel INSERT, it should be passing
"true" for the "override_rows" parameter, not "false".

So I think that in the 0004 patch, the if-block:

+ if (parallel_modify_partial_path_added)
+ {
+ final_rel->rows = current_rel->rows; /* ??? why
hasn't this been
+
* set above somewhere ???? */
+ generate_useful_gather_paths(root, final_rel, false);
+ }
+

can be reduced to:

+ if (parallel_modify_partial_path_added)
+ generate_useful_gather_paths(root, final_rel, true);
+

Regards,
Greg Nancarrow
Fujitsu Australia

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message 'Alvaro Herrera' 2021-03-04 01:51:08 Re: PATCH: Batch/pipelining support for libpq
Previous Message Tatsuo Ishii 2021-03-04 01:42:36 Re: PROXY protocol support